AI-Generated Steganography
New research suggests that AI models can produce perfectly secure steganographic text, images, and audio:
Abstract: Steganography is the practice of encoding secret information into innocuous content in such a manner that an adversarial third party would not realize that there is hidden meaning. While this problem has classically been studied in security literature, recent advances in generative models have led to a shared interest among security and machine learning researchers in developing scalable steganography techniques. In this work, we show that a steganography procedure is perfectly secure under Cachin (1998)’s information theoretic-model of steganography if and only if it is induced by a coupling. Furthermore, we show that, among perfectly secure procedures, a procedure is maximally efficient if and only if it is induced by a minimum entropy coupling. These insights yield what are, to the best of our knowledge, the first steganography algorithms to achieve perfect security guarantees with non-trivial efficiency; additionally, these algorithms are highly scalable. To provide empirical validation, we compare a minimum entropy coupling-based approach to three modern baselines—arithmetic coding, Meteor, and adaptive dynamic grouping—using GPT-2, WaveRNN, and Image Transformer as communication channels. We find that the minimum entropy coupling-based approach achieves superior encoding efficiency, despite its stronger security constraints. In aggregate, these results suggest that it may be natural to view information-theoretic steganography through the lens of minimum entropy coupling.
News article.
EDITED TO ADD (6/13): Comments.
Clive Robinson • June 12, 2023 10:44 AM
@ ALL,
Re : The article does not explain…
If you read the article, it’s clear the journalist does not actually understand what they are writing about well enough to explain it clearly.
They repeatedly say “minimum entropy coupling” as though it’s an incantation that will magically convey meaning to the reader…
Also, you will find two things to ponder:
1, In order to come up with a new message indistinguishable from the original, innocuous one, you have to create a perfect simulation of the cover text distribution… …For human-generated text, this is not feasible… For that reason, perfectly secure steganography has long seemed out of reach.
2, But machine-generated text, of course, is not created by humans. The recent rise of generative models that focus on language, or others that produce images or sounds, suggests that perfectly secure steganography might be possible in the real world.
Both are “general case statements” which are “mostly but not always true”…
There are ways human-generated text can carry a steganographic channel without it being detected by statistics (I’ve shown this in the past on this blog a number of times when arguing about the impossibility of governments stopping crypto whilst still allowing communications, and why “back doors” will not work).
Likewise, there are ways LLM-generated text will show by statistics that there is a non-negligible probability it has a steganographic channel within it.
The trick behind this “minimum entropy coupling” is to get the “statistical curves” as identical as possible, so that no distinguishing test is possible…
In essence, the simplistic way to do that is with a random source with a flat distribution. Or, more formally, a fixed phrase with a stochastic element… You might call it a “Stochastic Parrot” if your mind wants to go that way (and some do with LLMs)…
A very simplistic way to describe this so you can get an idea is to have a stock phrase such as,
“We should meetup for a XXX”
Where the XXX is a word randomly selected from a list of words that have equal probability. Such as,
{drink, beer, tea, coffee, sandwich, etc…}
As long as the selection is random, no one phrase generated has any more or less meaning than any other, if any; thus “all are equiprobable”, which is the basis of Shannon’s “perfect secrecy”.
How to make the selection “random” or “stochastic” to an observer, but not to an intended recipient, is actually a bit harder (in fact nearly impossible, the more general the language used).
Which is why, when I previously described it, I fell back on an open unicity distance and Shannon’s “perfect secrecy”.
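The stock-phrase idea above can be sketched in a few lines of Python. This is only an illustration, not the paper’s method: the word list, phrase, and 3-bit payload are all hypothetical, and the shared one-time-pad key is what makes the choice look uniformly random to an observer while remaining invertible by the recipient.

```python
import secrets

# Hypothetical word list: 8 equiprobable fillers, so each phrase
# can carry exactly 3 bits of hidden payload.
WORDS = ["drink", "beer", "tea", "coffee", "sandwich", "juice", "snack", "walk"]

def encode(secret: int, key: int) -> str:
    # XOR the 3-bit secret with a 3-bit one-time-pad key; to an observer
    # the chosen word is uniformly random whatever the secret is.
    return f"We should meetup for a {WORDS[secret ^ key]}"

def decode(message: str, key: int) -> int:
    # The recipient, holding the same key, inverts the XOR.
    return WORDS.index(message.rsplit(" ", 1)[1]) ^ key

key = secrets.randbelow(8)   # shared in advance, used only once
msg = encode(0b101, key)
assert decode(msg, key) == 0b101
```

Because every index is equally likely under a uniform key, all eight phrases are equiprobable regardless of the secret — Shannon’s perfect secrecy in miniature.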
Now some of you are probably scratching your heads about LLMs and why they might be good for this.
Well, as I’ve said before, LLMs are really just “noise shaping filters” with massively parallel filter paths, any one of which might be stochastically selected according to a fixed probability. The statistics of the filter, provided the weights are not changed, remain the same regardless of the noise put in. If the noise starts off with a flat distribution, then the output of the LLM filter will always have the same statistical characteristics, no matter how often you run it.
The use of “Shannon perfect secrecy” –AKA the OTP– means the observer cannot pick up any statistical inference…
So hopefully that fills in a few gaps the article author jumped over.
Oh “minimum entropy coupling” is quite a new spin on an old idea. Thus you might have trouble looking it up, and if you do find a paper, you might find it a little tough getting your head around the language…
But I can assure you the idea, if written up sensibly for a beginner, is fairly easy to understand (you just need sufficient column inches).
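For readers who do want to look it up: a common way to approximate a minimum entropy coupling is a greedy pairing of the largest remaining probability masses. The sketch below is that well-known greedy approximation, not the paper’s exact algorithm, and the two example distributions are purely illustrative. It builds a joint distribution whose marginals exactly match the two inputs while keeping the joint entropy low — the property the paper ties to efficient, perfectly secure encoding.

```python
def greedy_coupling(p, q, tol=1e-12):
    """Greedy approximation to a minimum entropy coupling of distributions
    p and q: repeatedly pair the largest remaining masses. The returned
    joint distribution has marginals exactly p and q, with low
    (though not provably minimal) entropy."""
    p, q = list(p), list(q)
    joint = {}
    while True:
        i = max(range(len(p)), key=p.__getitem__)
        j = max(range(len(q)), key=q.__getitem__)
        m = min(p[i], q[j])
        if m <= tol:
            break
        joint[(i, j)] = joint.get((i, j), 0.0) + m
        p[i] -= m
        q[j] -= m
    return joint

# Couple a uniform 2-symbol "secret" distribution to a skewed "cover" one.
joint = greedy_coupling([0.5, 0.5], [0.6, 0.4])
# Marginals are preserved: summing over one index recovers each input.
assert abs(sum(v for (i, _), v in joint.items() if i == 0) - 0.5) < 1e-9
assert abs(sum(v for (_, j), v in joint.items() if j == 0) - 0.6) < 1e-9
```

Intuitively, the fewer (secret, cover) pairs that carry probability mass, the less randomness is “wasted”, which is why low joint entropy corresponds to higher encoding efficiency.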