Bruce Schneier
Schneier on Security

A blog covering security and security technology.

February 1, 2006

The NSA on How to Redact

Interesting paper.

Both the Microsoft Word document format (MS Word) and Adobe Portable Document (PDF) are complex, sophisticated computer data formats. They can contain many kinds of information such as text, graphics, tables, images, meta-data, and more all mixed together. The complexity makes them potential vehicles for exposing information unintentionally, especially when downgrading or sanitizing classified materials. Although the focus is on MS Word, the general guidance applies to other word processors and office tools, such as WordPerfect, PowerPoint, Excel, Star Office, etc.

EDITED TO ADD (2/1): The NSA page for the redaction document, and other "Security Configuration Guides," is here.

Posted on February 1, 2006 at 1:09 PM • 21 Comments

I'm amused that the final major step is to convert to .pdf. Apparently, the NSA considers the .doc format non-trivial to redact.

Posted by: Fred Page at February 1, 2006 1:45 PM

Interesting that they suggest replacing written content with a series of a single letter. I remember reading a paper from Daniel Lopresti and A. Lawrence Spitz where they showed that you can often recover the original word if you know the size of the redacted word and its context.

Posted by: Eric Miller at February 1, 2006 1:52 PM

Amusing trivia: length of download document in KB = 666 ;-)

Posted by: Stu Savory at February 1, 2006 1:52 PM

It cracks me up that in such a serious document the screenshots of Word have the little kitty cat :D

@eric: hurm... I was thinking that. At least it's not a case of just replacing the letters with a fixed char, but preserving the spaces. I suppose it just depends on how much is being redacted.
The larger the block, the harder it is to work that out. For a genuine old skool redacted look you could replace the deleted chars (including spaces) with the same number of chars of lorem ipsum, then make the background and text black.

Posted by: Sabre150 at February 1, 2006 2:19 PM

For what it's worth, the NSA page for the redaction document and other "Security Configuration Guides" is at http://www.nsa.gov/snac/

(The link Bruce gave for the document is at fas.org. No problem per se. Many people may be more comfortable going to the FAS site instead of NSA's. But sometimes it is good to know where the author offers a document.)

Posted by: J.D. Abolins at February 1, 2006 3:11 PM

@Fred: Read the paper - under "Details", it says exactly why they start with Word (because everyone uses it) and end with PDF (because it's the de facto standard for distributing read-only forms of a document).

For redacting Word documents on their own, there's always the redaction tool that Microsoft post at http://www.microsoft.com/downloads/details.aspx?... - I haven't tried it myself, mind you.

Posted by: Alun Jones at February 1, 2006 3:13 PM

I guess this is easier than erasing/hiding the secret parts, printing out a hardcopy, reviewing it, and scanning the hardcopy back into a new file.

Posted by: John at February 1, 2006 4:38 PM

How long until it's modified to say "In order to redact a document, one should first , and then . Finally, an should be applied..."

Posted by: Nick Johnson at February 1, 2006 4:44 PM

Gah! Those weren't real HTML tags, blog software! I guess my mock redaction was a bit more realistic than I was trying for. ;)

Posted by: Nick Johnson at February 1, 2006 4:45 PM

@Alun Jones Thanks; I think I was reading too much into the first sentence in section 6.
Posted by: Fred Page at February 1, 2006 5:42 PM

(As demonstrated by page two of the document, adding a tantalizing line saying "This page intentionally left blank" is just bound to keep people guessing as to what you've hidden under all that whitespace...)

Posted by: mpg at February 1, 2006 6:50 PM

Poor strategy. Why not print the result to JPG or TIF files? These formats simply make the "black rectangle" and "white rectangle" tricks, which are meant for hard copy, work flawlessly. (Of course, care should still be taken with metadata.)

Posted by: WC Leung at February 1, 2006 9:38 PM

@Someone: Because of the ratio of users-that-use-MS-Word / users-that-use-*Tex. This is not an NSA employee manual. It's a guide for the rest of us.

Posted by: Dimitris Andrakakis at February 2, 2006 3:07 AM

Use Acrobat to save each page to TIF, convert to CCITT Group 4 Fax format (uses the least amount of space), remove metadata from the resulting TIF, collate back into Acrobat and finally scramble metadata and date in the PDF using a good text editor. If you need strong anonymity, print to paper, then scan and post at an internet cafe.

Do any of you guys know if Adobe generates any kind of information in the PDF file that links back to your software serial or some kind of hardware identifier?

Posted by: chill at February 2, 2006 5:23 AM

Use Acrobat to save each page to TIF, convert to CCITT Group 4 Fax (uses least space), remove metadata from the resulting TIF, collate back into Acrobat and finally scramble metadata and date in the PDF using a good text editor. If you need strong anonymity, print to paper, then scan and post at an internet cafe.

Do any of you guys know if Adobe includes any kind of information in the PDF file that links back to your software serial or some kind of hardware identifier?

Posted by: Chill at February 2, 2006 5:32 AM

I am currently examining digital redaction methods for use in government archives. The print to PDF method is widely recommended.
The reason is that a PDF printed in this way will not contain any hidden metadata which may exist in the original format. In fact, an existing PDF would also be converted into another PDF. It is the conversion process itself which (apparently) ensures that non-visible material is excluded.

More extreme methods involve printing to PDF, then OCRing the PDF to make a new document (PDF or otherwise), to ensure complete isolation from anything that is not visible.

Reasons why documents are not simply converted to an image file format are:

(1) Most image file formats can contain hidden metadata (simple bitmaps don't; JPEGs and TIFFs do). You must either verify or trust that a tool converting a document to a bitmap is not also helpfully preserving hidden metadata (like the author field, for example).

Posted by: Matt Palmer at February 2, 2006 6:29 AM

Alun Jones: "For redacting Word documents on their own, there's always the redaction tool that Microsoft post at http://www.microsoft.com/downloads/details.aspx?... - I haven't tried it myself, mind you."

This tool does not provide a complete, secure redaction solution in the same sense as the NSA method, although it is useful. This tool is about the actual act of redacting material itself, not securing the redacted document once done. It can only perform textual redaction (graphics and other objects cannot be redacted by the method).

It works by replacing characters with a roughly equivalent length of narrow characters - the pipe symbol |, which are formatted to appear as black on black. This is good, because it largely preserves the formatting of the document, but the replacement is not guaranteed to be *exactly* the same length as the original word(s), and the number of characters will be different too. This helps to foil various kinds of attacks on the redacted material, including guessing words by their exact positioning and length.

However, this "redacted" Word document may still contain all sorts of other hidden metadata.
To get rid of this, the convert-to-PDF, print-and-OCR, or another method is used to ensure that the document contains only what is currently visible. This is securing the redaction.

Posted by: Anonymous at February 2, 2006 6:39 AM

I went looking for easter eggs. There appears to be one item "redacted" from the original document. Perhaps I just missed it, but there's a non-visible "CLASSIFICATION//X1" tag line in there. Yeah, it's timid, but a little ironic.

Posted by: Redacted at February 2, 2006 8:58 AM

One reason NSA doesn't print and scan is that it's under a statutory mandate to promote accessibility for the handicapped, as are other government agencies. Printing and scanning gives you image files that can't be accessed by reader software, whereas files converted to PDF directly from Word are accessible.

Printing a visually redacted document and then scanning is easily the most secure way of producing a redacted document, though, and it's relatively easy. I recently did just that with a 50-page document that needed to be filed with a government agency in both confidential and redacted form. I used a style in Word that I named "confidential", and in the full confidential version I shaded the style with a bit of gray. In the version used for redaction, this style was changed to white-on-white, then printed and scanned. Fast and easy, but not accessible.

Posted by: Michael Sullivan at February 8, 2006 9:22 PM
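
One comment above describes Microsoft's redaction tool as replacing text with pipe characters of roughly, but not exactly, the original length, so that a reader cannot guess the hidden words from their size. A minimal Python sketch of that idea follows. It is not the tool's actual implementation: the function name `redact_span`, the `jitter` parameter, and the sample sentence are all hypothetical, and the sketch only touches visible text, not the hidden metadata the thread warns about.

```python
import random


def redact_span(text, start, end, filler="|", jitter=3, rng=None):
    """Replace text[start:end] with filler of a deliberately different length.

    Hypothetical helper for illustration; jitter must be at least 1.
    """
    rng = rng or random.Random()
    original_len = end - start
    # Candidate length changes: never zero, and never shrinking to nothing,
    # so the filler length cannot equal the length of the hidden text.
    offsets = [
        d for d in range(-jitter, jitter + 1)
        if d != 0 and original_len + d > 0
    ]
    new_len = original_len + rng.choice(offsets)
    return text[:start] + filler * new_len + text[end:]


# Usage: hide a name in a made-up sentence.
sentence = "The source is Alice Smith of Example Corp."
begin = sentence.index("Alice Smith")
print(redact_span(sentence, begin, begin + len("Alice Smith")))
```

Varying the filler length only blunts the length-and-context attack Eric Miller mentions. The result would still need the convert-to-PDF or print-and-scan step to drop anything that is not visible.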