Hard-Drive Steganography through Fragmentation

Khan and his colleagues have written software that ensures clusters of a file, rather than being positioned at the whim of the disc drive controller chip, as is usually the case, are positioned according to a code. All the person at the other end needs to know is which file’s cluster positions have been encoded.

The code depends on whether sequential clusters in a file are situated adjacent to each other on the hard disc or not. If they are adjacent, this corresponds to a binary 1 in the secret message.
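
To make the adjacency code concrete, here is a minimal Python sketch. It is not Khan's software. The function names, the starting cluster number, and the gap size are assumptions for illustration, as is the reading that a non-adjacent pair stands for a binary 0. A real tool would have to work through the filesystem's allocator instead of a plain list of cluster numbers.

```python
def decode_bits(clusters):
    """Read one bit per consecutive pair of clusters: adjacent -> 1, otherwise -> 0."""
    return [1 if b == a + 1 else 0 for a, b in zip(clusters, clusters[1:])]


def encode_layout(bits, start=100, gap=2):
    """Pick cluster numbers so that decode_bits() recovers `bits`.

    `start` and `gap` are hypothetical parameters: a 1 places the next
    cluster directly after the previous one, a 0 skips ahead by `gap`.
    """
    if gap < 2:
        raise ValueError("gap must leave at least one cluster between neighbours")
    layout = [start]
    for bit in bits:
        layout.append(layout[-1] + (1 if bit else gap))
    return layout


message = [1, 0, 1, 1, 0]
layout = encode_layout(message)   # [100, 101, 103, 104, 105, 107]
recovered = decode_bits(layout)   # [1, 0, 1, 1, 0]
```

A file of n clusters carries n - 1 bits under this reading, which is why the receiver only needs to know which file to inspect.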

Paper.

Tags: hardware, steganography

Posted on April 25, 2011 at 5:24 AM • 38 Comments

Comments

Darien Kindlund • April 25, 2011 6:03 AM

Interesting. So to counter, you simply defragment the hard drive to wipe out the hidden message.

anonymous • April 25, 2011 6:20 AM

This technique isn’t new.

Matt Gillard • April 25, 2011 6:21 AM

That is assuming you know the hdd is encrypted! That’s the beauty of this method.

David Frier • April 25, 2011 6:25 AM

As Olin Sibert would say, you can ALWAYS send one bit at a time. The job for the defender is to make the outbound bit-rate too low or noisy to be of enough value.

By that standard, this trick defends against itself. 🙂

askme233 • April 25, 2011 6:57 AM

So is there a similar approach to network communication that uses packet fragmentation patterns to encrypt/hide a message in a data stream?

That would defeat David’s issue as it is easier to transmit and can deliver much higher info for volume.

Of course, you could just encrypt the data in the packets (or the harddrive) but where is the movie plot fun in that?

ASK

askme233 • April 25, 2011 7:09 AM

OK. Way to read before posting…

“data intended to be secret is added to the pixels in digital images, or used to change the transmission timing of internet packets”

pooja singh • April 25, 2011 7:48 AM

can u please give me the information about Steganography?

JT • April 25, 2011 8:23 AM

Foreign intelligence service much, pooja?

Spider • April 25, 2011 8:52 AM

Yeah… I’ve done that before. We were really paranoid, or maybe just bored from lack of other interesting work. There are some non-consumer devices out there that have this already set up. But I doubt anyone really cares. They weren’t that successful. There is no one trying to crack them.

a different phil • April 25, 2011 9:03 AM

Pooja, try Wikipedia.

Andrew2 • April 25, 2011 9:26 AM

Uhm… doesn’t the operating system usually decide which cluster of a file goes where, not the disk controller?

Also, unless the bit rate were very very low, statistical analysis of the fragmentation rate of the drive would likely reveal the presence of a hidden channel. Assuming of course that the attacker knows enough to check.
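
A rough Python sketch of the kind of check meant here: measure what fraction of consecutive cluster pairs in a file are adjacent and compare it with what is normal for the drive. The function name and the sample cluster lists are made up for illustration, and this only addresses a naive one-bit-per-pair encoding, not necessarily the scheme in the paper.

```python
def adjacency_rate(clusters):
    """Fraction of consecutive cluster pairs that sit next to each other on disk."""
    pairs = list(zip(clusters, clusters[1:]))
    if not pairs:
        raise ValueError("need at least two clusters")
    return sum(1 for a, b in pairs if b == a + 1) / len(pairs)


# Hypothetical cluster lists: an unfragmented file and a file carrying bits.
ordinary_file = list(range(1000, 1020))
carrier_file = [100, 101, 103, 104, 105, 107]

ordinary_rate = adjacency_rate(ordinary_file)  # every pair adjacent
carrier_rate = adjacency_rate(carrier_file)    # only some pairs adjacent
```

If the hidden bits look random, about half the pairs in a carrier file would be non-adjacent, which stands out against files that are mostly contiguous.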

kashmarek • April 25, 2011 10:05 AM

Puleeze…let’s show some reality here!

Rick Auricchio • April 25, 2011 11:08 AM

This sounds similar to NRZI encoding of data bits magtape or floppy disc. A flip of the magnetic N-S pole is a 1, whereas no change is a zero.
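
For readers who have not met NRZI, a minimal Python sketch of the idea follows. The signal is modelled as a list of 0/1 levels and the starting level is an assumption; real tape and floppy formats add clocking and framing that are left out here.

```python
def nrzi_encode(bits, level=0):
    """Flip the signal level for every 1, hold it for every 0."""
    levels = []
    for bit in bits:
        if bit:
            level ^= 1
        levels.append(level)
    return levels


def nrzi_decode(levels, level=0):
    """A change from the previous level reads as 1, no change as 0."""
    bits = []
    for current in levels:
        bits.append(1 if current != level else 0)
        level = current
    return bits


signal = nrzi_encode([1, 0, 1, 1, 0])   # [1, 1, 0, 1, 1]
decoded = nrzi_decode(signal)           # [1, 0, 1, 1, 0]
```

The parallel with the fragmentation scheme is that the information sits in the relationship between neighbours, not in any single position.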

In the fragmentation example, a fragmented block is a 1 and a contiguous one is a zero.

Nonetheless, it’s an interesting way to encode a data stream, even if you’d be lucky to encode the 272-word Gettysburg Address that way.

RH • April 25, 2011 11:09 AM

20Mb in 160Gb. Assuming sector sizes of 4kB (new drives), that’s 40M sectors.
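
The arithmetic can be checked with a few lines of Python. This assumes decimal units and reads "20Mb" and "160Gb" as 20 megabytes and 160 gigabytes; if megabits were meant, the figures shrink by a factor of eight.

```python
DRIVE_BYTES = 160 * 10**9    # assumed: 160 GB, decimal units
HIDDEN_BYTES = 20 * 10**6    # assumed: 20 MB of hidden data
CLUSTER_BYTES = 4 * 10**3    # 4 kB per sector/cluster, as in the comment

clusters = DRIVE_BYTES // CLUSTER_BYTES           # 40,000,000
hidden_bits = HIDDEN_BYTES * 8                    # 160,000,000
bits_per_cluster = hidden_bits / clusters         # 4.0
ratio_percent = 100 * HIDDEN_BYTES / DRIVE_BYTES  # 0.0125
```

Under these assumptions the hidden data needs about four bits per 4 kB cluster, more than a single adjacent/non-adjacent bit per cluster can carry, so the quoted capacity presumably assumes smaller clusters or a denser code.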

He claims it “looks just like normal usage,” but I find it amazing that he can make things look normal with that kind of ratio of data to available space for data.

I’m much more intrigued by the steganography in JPEGs, because it’s so hard to detect the message.

Rick Auricchio • April 25, 2011 11:09 AM

That should have read “encoding of data bits ON magtape…”

Also, a contiguous [block] constitutes a zero.

dmitryk • April 25, 2011 11:13 AM

This method could be detected from the typical usage patterns of different types of media. The USB stick used as the sample would reveal the covert channel if one assumes the stick is used for archiving: infrequently overwritten data would normally be stored sequentially, with no fragmentation. Another potential detection method relies on standard OS and filesystem behavior, which fragments data on the disk in predictable ways.

143k • April 25, 2011 11:35 AM

Why not just write data into the unallocated sectors on the disk?

Since they don’t map to a file in the FAT table, nobody would know to look for them.

The person on the receiving end would just need to know what sectors to read to obtain the message.

Of course, any whole-disk-scanning “forensic” software may detect key words in the message, if that message is stored in plain-text.

EH • April 25, 2011 1:33 PM

Uh, all you “security” people replying to pooja should realize that comment is linkspam.

aikimark • April 25, 2011 1:48 PM

Reminds me of the biliteral cipher that you wrote about in March:

http://www.schneier.com/blog/archives/2011/03/biliteral_ciphe.html

Nick P • April 25, 2011 4:04 PM

This covert channel seems really small in data storage and would take forever to read. The use of easily hidden USB flash drives, memory cards, & optical media makes more sense most of the time. Not to mention Feds might start looking for this now, especially as tools like EnCase could do this with little modification.

Clive Robinson • April 25, 2011 4:49 PM

One way of looking at it is stego, but that is not really what is going on.

As described, the moving around of the blocks is effectively a variation of a transposition cipher.

That is there is the nominal position of blocks based on a lot of meta data and the actual position of the blocks on the disk.

I did some work back in the late 90’s about predicting the position of file blocks to detect erased files by the holes they leave (like foot prints in mud that get fossilized in place)

Now I’d have to have a longish think on it but I suspect it will fail to a couple of things, the first being that the data alignment does not match the file meta data and further “contact analysis” of the blocks.

Also, single-user machines tend to have characteristic files and sizes based on what the user does with the machine; the faux missing files would still need to match this user profile.

tommy • April 25, 2011 5:29 PM

@ Ed: You beat me to it, but glad somebody finally posted about Pooja’s linkspam.

MODERATOR: Please review post by pooja singh at April 25, 2011 7:48 AM, for the link in the signature. The question in the post is absurd for any reader here.

@ ALL: So, now you have to give or mail your contact an entire hard drive? Or leave it in a dead drop? No, that‘s not going to look suspicious at all …

Idea: Maybe a full-disk-image backup, per, say, Acronis (no, not spamming for them; there are others), which could be emailed, probably arousing less suspicion. Recipient paints it on a blank or unneeded HDD and uses this method of steg decrypt.

Plausible deny: “My friend’s HD died, and we use the same OS, programs, etc.”. Would need to use same or very similar brand/model of computer, and have lots of innocuous info on the disk. They recommend almost-full.

Hmm… (as CR would say 😉 OR— even better, and less suspicious — send your friend your “art gallery collection” HD (for external mount) – takes up lots of space, especially if videos, which meet the demand for large numbers of large blocks, hence likely fragmentation. Totally deniable. And the investigators will be too “distracted” to pay much attention to their forensics duties.

(though i still think the image-steg is cooler, esp. if done on whatever level of “art” is permitted in the locales involved.)

Moderator • April 25, 2011 7:06 PM

We do get authentic newbie questions here, often from people who’ve Googled some technical term. They’re not automatically suspicious. The blank blog template that they linked to could be a placeholder for something nefarious or could just as easily mean they haven’t finished setting up their blog yet.

On the other hand, an empty blog is no use to anyone even if it’s legit, so on second thought I will strip out the link.

tommy • April 25, 2011 11:02 PM

@ Moderator:

I understand, but one would think that Wikipedia would be the first place to look in such a case. In fact, I just Googled “steganography”, and WP came up first; a search for “schneier” among the first 100 results went dead (red in the “find” box) when I got to “schn”.

Using a keyword from any forum thread or blog post is a common technique among spammers these days, including automated ones. As noted, it did get past a number of eyes.

I believe the original link was an attempt to sell hosting to potential bloggers, though of course I could be mistaken. Thanks for your time.

Neil • April 25, 2011 11:51 PM

Bruce, I’d be interested to know what you think is “clever” about this particular side channel vs other side channels?

Nick P • April 26, 2011 1:17 AM

@ Neil

“Bruce, I’d be interested to know what you think is clever about this particular side channel vs other side channels?”

I second that. I find that timing channels are the hardest to identify and implement. A good timing channel can allow for a tremendous amount of information to leak, as the cache attacks on processors and HyperThreading show. They are also the hardest to observe and plug without severely impacting performance.

On a side note, I find effective strategies for preventing or nullifying covert channels to be more clever than the identification of any particular channel. It’s an extremely difficult problem to solve. Imho, modern OS’ designs provide a resilient, distributed sea of covert channels. With even A1 class systems having residual channels, I’m not surprised to see a steady stream of them.

Alfred Neuman • April 26, 2011 1:26 AM

I’m an academic and part of my daily work consists of reviewing journal and conference papers. We get an ongoing stream of papers from India and Pakistan with exquisitely homebrew, undergrad-student-level stego techniques that usually correspond to ideas others rejected a decade ago because they’re trivially detected and defeated. This looks like yet another one of those.

I’ve posted this anonymously because I’m certain there’ll now be comments about racist Americans, even though all I’ve done is report an unusual statistical pattern of papers received for review.

AC2 • April 26, 2011 1:58 AM

How on earth is this supposed to work?

A. Mr Mole gets his hands on some confidential data in electronic form on a computer under his control (presumably in a sealed room as in Mission Impossible, because if it was on a laptop we wouldn’t be discussing this)

B. From there he encodes this data onto a portable hard disk connected to said computer using this technique. (How he gets this ingenious piece of software onto said computer is left unsaid)

C. Then he hands over the said portable hard disk to the enemy, presumably after passing a security check (and not via the convenient cooling duct after evading the lasers) on the drive that concludes ‘yup nothing important here’

D. Enemy decodes the information from the portable hard disk, presumably now well outside the ability of security team to make any further checks

What rubbish…

From point B onwards he might as well copy all the info onto a micro-SD card using a reader connected to the accommodating computer he has, shove it (the micro-SD card I mean) up his a** and get on…

Any org that lets people connect portable hard drives to computers containing confidential info and cart said drives past a security check is ‘gonna get GOT’, with or without this…

And anyone doing this is already way past any ‘plausible deniability’…

asd • April 26, 2011 3:35 AM

The data could already be on the portable hdd; all the program would need to do is find multiples of 1byte(or something)*size and then change the key to map to the places on the hdd

Paeniteo • April 26, 2011 4:16 AM

@askme233: “So is there a similar approach to network communication that uses packet fragmentation patterns to encrypt/hide a message in a data stream?”

In this case you would have to deal with (possibly significant) noise, as the transmission of the data stream can change the fragmentation pattern.
Also, I believe that it would be rather easy to detect “unusual” fragmentation patterns (but IANA-network-engineer).

NB: “easy to detect” strongly depends on your exact threat scenario. One should define that first and then construct adequate defences (or, rather, only with an a-priori defined threat scenario are you able to judge about the adequacy of your measures).

Gianluca Ghettini • April 26, 2011 4:43 AM

naaaaa, to me this technique is far too unreliable… just wipe out the hd unused space with random data (that’s a common practice allowing for plausible deniability), then superimpose an encrypted data block over the random data (take note of the start offset). If the encoding is good, nobody can tell the encrypted block apart from the random data. That allows for good steganography.

Gianluca Ghettini • April 26, 2011 4:45 AM

just another thought….

steganography can be implemented hiding the data OR taking advantage of plausible deniability…

Ray Foo • April 26, 2011 11:49 PM

Sounds like yet another application of Bacon’s cipher 🙂

http://en.wikipedia.org/wiki/Bacon's_cipher

Michael S. Gordon • April 27, 2011 12:12 AM

I think this would be too unreliable. I personally would go with audio since it is a very inaccurate form of data. Switching a few bits here and there would not be picked up by the listener, but someone who knows the message is in the audio file and has the right key can decode it with no one being the wiser.

RonK • April 27, 2011 7:35 AM

Somewhat reminiscent of the storage of info via selecting the permutation of colors in a GIF image header (which has been known for ages).
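
A minimal Python sketch of the permutation idea, for anyone who has not seen it: a number is mapped to an ordering of the palette entries and read back from that ordering. The function names and the four-colour palette are invented for illustration, the entries are assumed to be distinct, and rewriting a real GIF would also require remapping the pixel indices, which is not shown.

```python
import math


def int_to_perm(value, items):
    """Return the ordering of `items` that encodes `value` (factorial number system)."""
    pool = sorted(items)
    if not 0 <= value < math.factorial(len(pool)):
        raise ValueError("value does not fit in a permutation of this many items")
    perm = []
    for remaining in range(len(pool), 0, -1):
        index, value = divmod(value, math.factorial(remaining - 1))
        perm.append(pool.pop(index))
    return perm


def perm_to_int(perm):
    """Recover the number encoded by the order of `perm`."""
    pool = sorted(perm)
    value = 0
    for position, item in enumerate(perm):
        index = pool.index(item)
        pool.pop(index)
        value += index * math.factorial(len(perm) - 1 - position)
    return value


palette = ["black", "blue", "green", "red"]   # hypothetical 4-entry palette
hidden = int_to_perm(17, palette)
recovered = perm_to_int(hidden)               # 17
```

A palette of n distinct colours can hold log2(n!) bits this way, which works out to roughly 1,680 bits for a full 256-entry palette.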

keith • April 27, 2011 8:49 AM

As a message drop that is easily trashed, it’s interesting, but its portability per message is limited (posting drives?). Steganography in photos works because you can hide the message-containing image anywhere.

To have any level of portability you’d have to use virtual drives. Would the technique still apply?

ichinin • April 27, 2011 1:52 PM

1: Statistical comparison of the fragmentation level vs fragmentation in other areas.

2: Wear-leveling in Solid State Drives.

This is a really crappy idea.

leuk_he • April 28, 2011 9:28 AM

@ichinin
1 If you read the paper, they claim to have made a study of typical fragmentation. So a simple statistical analysis would not work.

2 Wear-leveling in SSDs makes no difference; to the OS an SSD still looks like a block device. The SSD firmware hides the fragmentation caused by wear-leveling.

However, hiding 0.0125% of effective data is not an impressive feat; embedding in a noisy JPEG file gives a much higher storage ratio.

Schneier on Security
