Poor Man's Steganography

Hide files inside PDF documents: “embed a file in a PDF document and corrupt the reference, thereby effectively making the embedded file invisible to the PDF reader.”

Tags: Adobe, concealment, steganography

Posted on July 14, 2009 at 1:48 PM • 24 Comments

Comments

pegr • July 14, 2009 2:28 PM

I bet the header is still in the clear in the PDF. 2/10, fools dummies only. (Making it 97% effective!)

RH • July 14, 2009 2:29 PM

This doesn’t seem like all that big of a deal. Many many file formats allow for silent handling of ‘garbage’.

I’m working with Microsoft’s PE format (the executable format that .NET assemblies, including C# ones, are stored in) right now: it explicitly allows for ‘garbage’ in the blob heap as long as it is not addressable within the PE table framework. I believe Word documents are notorious for having garbage in them as well.

How is this any more “sneaky” than commenting out text in an HTML file?

Now I could see this being useful if there was a way to malform the embedded file so that it’s not readable by humans, but search engines pick it up. You could tell if someone is distributing your work on the internet without consent using a quick Google search.

Didier Stevens • July 14, 2009 2:37 PM

@RH

Now I could see this being useful if there was a way to malform the embedded file so that it’s not readable by humans, but search engines pick it up.
You could tell if someone is distributing your work on the internet without consent using a quick Google search.

Yes, I’ve a trick to achieve this. But that’s for another post.

PP Kozon • July 14, 2009 2:40 PM

Is there such a thing as rich man’s steganography?

Matt Simmons • July 14, 2009 2:46 PM

Steganography is one of those things that no one will ever be able to prevent. So long as one human can change something, someone else can come along and interpret that change into meaning.

Unless the Matrix runs chrooted, of course.

Didier Stevens • July 14, 2009 2:47 PM

How is this any more “sneaky” than commenting out text in an HTML file?

http://blog.didierstevens.com/2008/03/31/hiding-inside-wikipedia/

Tim • July 14, 2009 3:48 PM

Yeah nothing interesting. It’s clear to anyone who reads the spec that you can create PDF objects that aren’t linked to from anywhere. No ‘corruption’ involved…
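For anyone who wants to look for such orphaned objects, here is a rough sketch in Python. It assumes an uncompressed, classic PDF in which objects appear as plain "N G obj" text and references as "N G R"; it will miss objects packed into compressed object streams and can be fooled by binary stream data that happens to match the patterns. The function name is made up for illustration.

```python
import re

# Naive patterns: "N G obj" starts an indirect object, "N G R" refers to one.
OBJ_DEF = re.compile(rb"(\d+)\s+(\d+)\s+obj\b")
OBJ_REF = re.compile(rb"(\d+)\s+(\d+)\s+R\b")


def unreferenced_objects(pdf_bytes: bytes) -> list[int]:
    """Return object numbers that are defined but never referenced.

    Hypothetical helper: a plain text scan, not a real PDF parser.
    """
    defined = {int(m.group(1)) for m in OBJ_DEF.finditer(pdf_bytes)}
    referenced = {int(m.group(1)) for m in OBJ_REF.finditer(pdf_bytes)}
    return sorted(defined - referenced)
```

An object turning up in this list is only a hint: as noted above, nothing in the format forbids objects that are not linked from anywhere.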

Tim • July 14, 2009 3:53 PM

This page gives a good idea of how easy it is:

http://www.gnupdf.org/Introduction_to_PDF

Read the first example and understand the links. Now imagine you add an extra object:

6 0 obj
<<
…hidden file here…
>>
endobj

and update the xref table. Job done.

Didier Stevens • July 14, 2009 4:07 PM

and update the xref table. Job done.
And also update the trailer. Don’t want a malformed PDF.
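The steps from the last few comments (add an object nobody points to, then fix the xref table and the trailer) can be sketched roughly as follows. This is a minimal illustration in Python, not the tool from the linked article: the function name, the tiny three-object page tree, and the object numbering are all assumptions, and the result has not been tested against any particular PDF reader.

```python
def build_pdf(hidden: bytes) -> bytes:
    """Build a tiny PDF whose last object is a stream nothing refers to.

    Hypothetical sketch: one blank page plus an unreferenced stream object.
    """
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 200 200] >>",
        # Object 4: present in the file and the xref table, but unreferenced.
        b"<< /Length %d >>\nstream\n" % len(hidden) + hidden + b"\nendstream",
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for num, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of "N 0 obj" for the xref table
        out += b"%d 0 obj\n" % num + body + b"\nendobj\n"

    xref_pos = len(out)
    size = len(objects) + 1  # object 0 is the head of the free list
    out += b"xref\n0 %d\n" % size
    out += b"0000000000 65535 f \n"
    for off in offsets:
        out += b"%010d 00000 n \n" % off  # each entry is exactly 20 bytes

    # The trailer's /Size must match the xref table, hence "update the trailer".
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % size
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)
```

Recomputing every offset from scratch is the lazy way to keep the xref table consistent; a tool that edits an existing PDF would more likely append an incremental update instead.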

aikimark • July 14, 2009 4:11 PM

@Didier

Nice and thanks.

However, I’m not sure I would call this trick steganographic in nature. The imbedded file’s data is much too contiguous (IMHO).

Didier Stevens • July 14, 2009 4:23 PM

@aikimark

However, I’m not sure I would call this trick steganographic in nature. The imbedded file’s data is much too contiguous (IMHO).

Agree, but had no better word for it.

FYI: this has been used during pentests (even without the stego option) to pass files through AV and IDS.

Alice Bevan-McGregor • July 14, 2009 5:20 PM

In other steganography news, hid.im hides BitTorrent downloads in PNGs:

http://torrentfreak.com/hidim-converts-torrents-into-png-images-090714/

Anon • July 14, 2009 5:43 PM

Obfuscation is easy. If there’s a lesson here, it isn’t that the PDF format is a great way to obfuscate, but that data can always be “smuggled” in spite of automated filters — that there’s an essentially infinite number of ways to package it.

Makes me wonder why we haven’t done better at making it trivial to get around the Great Firewall of China — it’s allegedly a simple (if huge) filter, the tools to work around it (plentiful bandwidth, crypto, etc.) are mature and widely available, and this should be the sort of cause crypto-libertarians are all about. If we can’t block spam, how can China block discussion of democracy? 🙂

Semi-relatedly, if you’re a company that wants to make a leaked document or program or something disappear, you may be pretty fucked. Storage is so cheap, and so many companies are giving it away free, that a moderately clever Internet user could get copies of a file posted in a dozen places on the Internet in an hour or two, including peer-to-peer networks and sites like Cryptome and Wikileaks that aren’t wimpy about taking things offline.

09 F9,
Anon

Anon • July 14, 2009 7:50 PM

I know a far more pathetic example. At my work, .exe files are automatically deleted from emails and cannot be downloaded.

If, however, you rename such a file .txt it will go through just fine, and run as long as it doesn’t require administrator privileges. That is good enough to install Firefox, etc.

Benson • July 14, 2009 8:12 PM

I’d like to point out the “GIFAR” concept published at last year’s Black Hat. Essentially, GIF and JPEG files keep their header at the start, while ZIP files keep their index at the end. You can concatenate an image and a zip, and the resultant file will be readable as both a zip and an image. Incidentally, a Java “jar” is just a zip, so you can hide an entire Java program in an “image file”. As such, pretty much any data you’d like to share covertly could probably be uploaded to something like ImageShack. I think the PDF vuln described here is just a variation on the GIFAR theme.
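The concatenation trick is easy to try with Python's standard library. The sketch below is an illustration only: the helper names are made up, and the 1x1 GIF is just a stand-in for real image bytes. It relies on the `zipfile` module tolerating data in front of the archive, which is the same behaviour self-extracting archives depend on.

```python
import base64
import io
import zipfile

# A widely circulated 1x1 GIF, used here only as stand-in image bytes.
GIF_BYTES = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)


def make_polyglot(image: bytes, files: dict[str, bytes]) -> bytes:
    """Append a ZIP archive holding `files` to the image bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return image + buf.getvalue()


def read_hidden(blob: bytes, name: str) -> bytes:
    """Read one member back out, treating the whole blob as a ZIP."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        return zf.read(name)
```

The blob still begins with the GIF signature, so anything that only checks the start of the file sees an image, while a zip reader works backwards from the end and finds the archive.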

The Imp • July 14, 2009 8:26 PM

However, I’m not sure I would call this trick steganographic in nature. The imbedded file’s data is much too contiguous (IMHO).

Cryptography is making it unreadable, unless you know the key.

Steganography is making it invisible, unless you know where to look.

Both have advantages and disadvantages, and together they’re even better (encrypted data is secure, but suspicious; obfuscated data seems benign, but is insecure; together it’s secure and looks harmless). But, any detail beyond the above is academic really, and neither is without a weakness. The point that some people might think to look is kind of like the fact that people can brute-force an otherwise cryptographically secure key; that doesn’t disqualify the definition, depending only on just how likely that possibility is.

DeCSS • July 14, 2009 8:47 PM

@ Benson

Wow, that’s a blast from the past!

I remember that being one of the ways demonstrated for distributing the DeCSS source code, way back in 2001 or so. And I’d bet that the general technique has been around for much, much longer.

http://www.cs.cmu.edu/~dst/DeCSS/Gallery/Stego/index.html

Michael • July 15, 2009 2:20 AM

Easier way: zip a file and rename the .zip to a .doc. Put it in your ‘My Documents’ folder. The size will not be abnormal, and unless someone is looking for a specific date, they will only find a corrupt .doc file. You can simply rename it back to .zip at your leisure and retrieve it.
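Worth noting that renaming only fools tools that trust the extension. A checker that looks at the first few bytes will still see a zip. A minimal sketch, with a hypothetical `sniff` helper and a deliberately short signature table:

```python
# Leading "magic" bytes for a few common formats (far from exhaustive).
SIGNATURES = {
    b"PK\x03\x04": "zip archive (also OOXML/ODF documents)",
    b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1": "OLE2 compound file (legacy .doc/.xls)",
    b"MZ": "DOS/Windows executable",
    b"%PDF-": "PDF document",
}


def sniff(data: bytes) -> str:
    """Guess a file type from its leading bytes, ignoring the file name."""
    for magic, label in SIGNATURES.items():
        if data.startswith(magic):
            return label
    return "unknown"
```

So a renamed zip in a documents folder would report as a zip archive here rather than as a legacy .doc, which is exactly the mismatch an examiner would look for.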

Stringer Bell • July 15, 2009 4:23 AM

Recent MS Office/OpenOffice document files are just zip files with a different extension. You can place files inside the zip file structure by dragging the document into WinZip.
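The same thing can be done without WinZip. Below is a small Python sketch that appends one extra member to any zip-based container; the function name and member names are hypothetical, and whether a given office suite tolerates or preserves the extra member on the next save is not something this example checks.

```python
import io
import zipfile


def add_member(container: bytes, name: str, payload: bytes) -> bytes:
    """Return a copy of a ZIP-based document with one extra member appended."""
    buf = io.BytesIO(container)
    # Mode "a" keeps the existing members and rewrites the central directory.
    with zipfile.ZipFile(buf, "a") as zf:
        zf.writestr(name, payload)
    return buf.getvalue()
```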

My first bit of sort-of-steganography was to write a program which stored a message in a BMP file. It made use of the extra byte on the end of odd-width pixel rows.
E.g., if you have a 31 pixel wide bitmap then due to rounding rows to word boundaries, you have one byte per row to set to whatever value you want (it will be ignored by image viewers).
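The arithmetic behind the padding trick can be sketched as follows. BMP rows are padded to a multiple of 4 bytes, so the number of spare bytes depends on the bit depth: a 31 pixel wide image has one spare byte per row at 8 bits per pixel and three at 24 bits per pixel. The helpers below are hypothetical and work on a raw pixel array only; parsing the BMP headers is left out.

```python
def row_padding(width_px: int, bits_per_pixel: int) -> int:
    """Unused bytes at the end of each BMP row (rows are 4-byte aligned)."""
    used = (width_px * bits_per_pixel + 7) // 8
    stride = (width_px * bits_per_pixel + 31) // 32 * 4
    return stride - used


def padding_slots(total_bytes: int, width_px: int, bpp: int) -> list[int]:
    """Offsets of every padding byte in a raw pixel array, row by row."""
    used = (width_px * bpp + 7) // 8
    pad = row_padding(width_px, bpp)
    stride = used + pad
    return [row * stride + used + i
            for row in range(total_bytes // stride)
            for i in range(pad)]


def embed(pixels: bytearray, width_px: int, bpp: int, message: bytes) -> None:
    """Write message bytes into the row padding, modifying pixels in place."""
    slots = padding_slots(len(pixels), width_px, bpp)
    if len(message) > len(slots):
        raise ValueError("message does not fit in the padding bytes")
    for slot, byte in zip(slots, message):
        pixels[slot] = byte


def extract(pixels: bytes, width_px: int, bpp: int, length: int) -> bytes:
    """Read `length` bytes back from the row padding."""
    slots = padding_slots(len(pixels), width_px, bpp)
    return bytes(pixels[slot] for slot in slots[:length])
```

Capacity is small (one to three bytes per row), and a width that is already a multiple of four pixels at 8 bits per pixel leaves no room at all.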

Did Not • July 15, 2009 6:46 AM

What if the next version of PDF has case insensitive file names? Suddenly the hidden file is in the clear. This is not much better than using lemon juice as invisible ink, or using a simple substitution cipher.

Pete Austin • July 15, 2009 7:03 AM

This would work with any XML-based document format, such as ODF or OOXML, and could be built into a word processor or other editor.

Here’s hoping no troll has patented this obvious idea yet. If not, I claim prior art to stop them.

Didier Stevens • July 15, 2009 8:04 AM

What if the next version of PDF has case insensitive file names?

It’s not the case sensitivity of the file names, but the case sensitivity of the PDF language.

Kyle Wilson • July 15, 2009 10:26 AM

The more interesting approach would be to encrypt the data before inserting it. Assuming that you insert only the ‘raw’ encrypted data (no headers or other recognizable bits), it should be very hard to tell whether this is simply a PDF with a corrupted/junk area or a PDF that is hiding useful data. I suspect that there are plenty of buggy PDF creator tools out there that might include ‘junk’ sections in the final file (say, regions of arbitrary process memory that got left in). Trying to prove that a block of apparently random junk in the file is actually an encrypted data block could be difficult.
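One way to see why this is hard to prove: about the only cheap test for "is this block encrypted?" is to measure how random the bytes look. A rough Python sketch of that measurement (the function name is hypothetical):

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in Counter(data).values())
```

Well-encrypted data scores close to 8 bits per byte, but so does compressed data, and PDF streams are commonly compressed. A high score therefore says "random-looking", not "encrypted", which fits the point above.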

Ven'Tatsu • July 16, 2009 3:26 PM

This idea has been taken even farther with the experimental Perl module Fuse::PDF ( http://search.cpan.org/~cdolan/Fuse-PDF-0.09/ ) which lets you mount a PDF file as a user space file system under some operating systems.
