LLMs and Text-in-Text Steganography
Turns out that LLMs are really good at hiding text messages in other text messages.
Derek Jones • May 11, 2026 8:48 AM
One of my attempts to shroud human-detectable meaning from LLMs was to make phonological changes to words. I was expecting word tokenization to make it difficult for LLMs to decode sentences such as the following:
“phashyon es cycklyq. chuyldren donth wanth tew weywr chloths vat there pairent weywr. pwroggwrammyng languij phashyon hash phricksionz vat inycially inqloob impleementaision suppoort, lybrareyz (whych sloa doun adopsion, ant wunsh establysht jobz ol avaylable too suppourt ecksysting kowd (slowyng doun va demighz ov a langguij).”
In practice even small 4 billion parameter models handle these changes with ease.
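A minimal sketch of this kind of respelling, for anyone who wants to try it against a model. The substitution rules below are hypothetical and purely illustrative; they are not the rules used to produce the sample sentence above.

```python
import re

# Hypothetical respelling rules (illustrative only, not the commenter's own).
RULES = {
    "tion": "shun",
    "th": "v",
    "f": "ph",
    "c": "k",
    "i": "y",
}

# Longest patterns first, so that "tion" is tried before "th" or "i".
_PATTERN = re.compile(
    "|".join(sorted(map(re.escape, RULES), key=len, reverse=True))
)


def respell(text: str) -> str:
    """Respell text in a single left-to-right pass.

    Each matched letter group is replaced exactly once, so the output of
    one rule is never fed back into another rule.
    """
    return _PATTERN.sub(lambda m: RULES[m.group(0)], text.lower())


print(respell("fashion is cyclic"))
```

The single-pass regex is a deliberate choice: applying the rules one after another would let an earlier substitution be rewritten by a later one.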
Clive Robinson • May 11, 2026 9:45 AM
@ ALL,
Neither the idea nor the general method is new.
I’ve been talking about this off and on on this blog for quite some time.
The real issue is at what layer of language you are going to have the steganography work.
The higher the layer (in effect, the longer the token) the more coherent the resulting stego-text is word for word, but the worse it reads overall, due to jumps in context or similar.
As for the paper, unless you are really keen, it is not well written thus…
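As a toy illustration of working at the word layer, here is a sketch that hides one bit per word by choosing between two near-synonyms. The template sentence and the synonym pairs are assumptions made up for this example; it is not the method from the paper, which uses an LLM.

```python
# Hypothetical synonym slots: each slot hides one bit (index 0 or index 1).
SLOTS = [
    ("big", "large"),
    ("quick", "fast"),
    ("begin", "start"),
    ("win", "earn"),
]

# Hypothetical cover sentence with one gap per slot.
TEMPLATE = "The {} dog made a {} dash to {} the race and hoped to {} a medal."


def embed(bits):
    """Return the cover sentence with each gap filled according to its bit."""
    if len(bits) != len(SLOTS):
        raise ValueError("need exactly one bit per slot")
    return TEMPLATE.format(*(pair[bit] for pair, bit in zip(SLOTS, bits)))


def extract(text):
    """Recover the bits by checking which word of each pair is present."""
    tokens = text.replace(".", "").split()
    bits = []
    for pair in SLOTS:
        for bit, word in enumerate(pair):
            if word in tokens:
                bits.append(bit)
    return bits


stego = embed([1, 0, 1, 1])
print(stego)
print(extract(stego))
```

Every output sentence reads cleanly word for word, but the capacity is only four bits, and a longer message would need many such sentences strung together, which is where the jumps in context start to show.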
Gheese • May 11, 2026 9:46 AM
@Privacy or as it happened with the Epstein files, try black font on black background. To be fair, that’s censorship, not steganography, but both have the same bypass.
Jonathan • May 11, 2026 10:08 AM
Almost a certainty that there’s a message encoded in that abstract, but you’d need to read the article to decode it.
taters • May 11, 2026 10:39 AM
To hide text, try white text on a white background
And for TEMPEST you want a special font on a dark grey background. There was a piece of software for Windows which was an anti-TEMPEST notepad but I forget the name of it.
@ taters,
It’s “Zero Emission Pad” that does anti-TEMPEST font smoothing, but it still gets (key)logged if you have a keylogger on the system.
It’s an old free program for Windows. If you find it, consider uploading it to archive.org unless they already have it. It was a rare piece of software which disappeared quite quickly from most of the web. I haven’t searched for it in years. I may have it on a backup somewhere.
a slightly related tool available in debian i noticed a couple weeks back:
snowdrop
it can watermark plaintext english, the c source code enabled branch is labeled experimental.
Clive Robinson • May 11, 2026 3:24 PM
@ taters, %, ALL,
With regards “Soft Tempest Fonts”.
The original work was done at the UK’s Cambridge Computer Labs run by Prof Ross J. Anderson by the researcher Markus G. Kuhn.
He released it via the lab’s blog “lightbluetouchpaper.org” where he and I had various conversations relating to TEMPEST and Van Eck phreaking, which was the cause of concern at the time (and still is).
Unfortunately the equipment available to the lab at the time is not what we would consider “top of the line” or even up to what a home hobbyist[1] can buy online for about half the price of an upper end mobile phone.
These “Software Defined Radios” (SDRs) have significantly changed what can be done. Worse, from a defender’s point of view, the software is vastly improved, and the likes of GNU Radio, which lets you define your radio parts/chain, have benefited significantly not just from the increased CPU power but also from the much wider I/Q bandwidths that FPGAs etc have given.
Thus TEMPEST / EmSec attacks have improved vastly beyond what Van Eck phreaking used to give.
The result is that the original Soft Tempest Fonts don’t give you much these days. Whilst they can be improved with Spread Spectrum techniques the gain is little.
Anyway, you can read more at,
https://www.cl.cam.ac.uk/~mgk25/emsec/softtempest-faq.html
[1] As I’ve indicated before Oona Räisänen (Windytan) used to occasionally put up work she had been doing in the EmSec sense on her blog “Absorptions”, which covered aspects of using SDRs etc.
However, it’s been a year since her last post. You can find other sites that get posted to more regularly, but I would advise caution as some are not sites you’d want to visit for various reasons.
duck • May 11, 2026 7:12 PM
@ Clive,
Windytan is/should still be on… Slashnet IRC I believe? In one of the more populated channels.. She responded to me once upon a time when I had some SDR questions.. Nice gal.
I recommend everyone try out the free/open source program:
It works on modern day monitors and demonstrates just how insecure our devices are. No special hardware needed! Your monitor will do the broadcasting to local AM/FM radio!
Now advanced users will appreciate using an SDR with programs like TempestSDR.
Excuse me there’s a knock at my door…
Clive Robinson • May 12, 2026 12:37 AM
@ Duck,
With regards Windytan, she has a YouTube account which has a link off to a Mastodon account that has recent posts,
https://mastodon.social/@windytan
Scanning down you can see she sometimes does what looks like “odd things” to others who dabble in electronics. This caught my eye just now,
https://mastodon.social/@windytan/112056020157051841
Because it is sort of a variation of an EmSec issue I mentioned to @mapq just a couple of days back and @figueritout several years back.
Then there is her software stuff she has added to GitHub,
https://github.com/windytan?tab=repositories
She also is a licensed Ham but she appears to keep her Ham and Radio Hacker stuff apart.
MrC • May 12, 2026 4:01 AM
It’s interesting how the first half of the paper describes a rather clever steganography method, while the second half reads like someone having a mental breakdown upon realizing that LLMs really are nothing more than overgrown autocomplete, without a whiff of intelligence, knowledge, or intent.
Clive Robinson • May 12, 2026 6:46 AM
@ MrC, ALL,
With regards your observation of,
“… while the second half reads like someone having a mental breakdown upon realizing that LLMs really are nothing more than overgrown autocomplete, without a whiff of intelligence, knowledge, or intent.”
The evidence for the realisation is growing.
The developer of cURL, Daniel Stenberg, who fairly publicly lambasted AI slop bug reports etc, has come out against Anthropic’s Mythos “hype”,
You can hide data in vector databases/embedding pipelines using some interesting steganography:
https://vectorsmuggle.org
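For a feel of how little room is needed, here is a minimal sketch that hides bits in the least significant mantissa bit of float32 embedding values. This is a generic low-order-bit trick assumed for illustration; it is not taken from the linked project, and the function names are hypothetical.

```python
import struct


def _float_to_bits(x: float) -> int:
    """Return the 32-bit IEEE 754 pattern of x, rounded to float32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def _bits_to_float(n: int) -> float:
    """Return the float32 value whose bit pattern is n."""
    return struct.unpack("<f", struct.pack("<I", n))[0]


def hide_bits(vector, bits):
    """Overwrite the lowest mantissa bit of the first len(bits) values."""
    if len(bits) > len(vector):
        raise ValueError("vector too short for payload")
    out = list(vector)
    for i, bit in enumerate(bits):
        out[i] = _bits_to_float((_float_to_bits(out[i]) & 0xFFFFFFFE) | bit)
    return out


def read_bits(vector, count):
    """Read back the lowest mantissa bit of the first count values."""
    return [_float_to_bits(x) & 1 for x in vector[:count]]


embedding = [0.12, -0.5, 0.33, 0.9]  # made-up values, not real model output
stego = hide_bits(embedding, [1, 0, 1, 1])
print(read_bits(stego, 4))
```

Each value moves by at most about one part in ten million, so similarity searches over the vectors are essentially unaffected. The payload only survives if the vectors are stored as float32 without further rounding or quantisation.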
Privacy • May 11, 2026 8:07 AM
To hide text, try white text on a white background. The human eye won’t see it but the computer will. If you want to test (your machine) not in the wild, try the command line to reformat the hard drive.