Unshredding

Using software, of course. The context is shredded and torn East German Stasi documents, but the technology is more general:

The machine-shredded stuff is confetti, largely unrecoverable. But in May 2007, a team of German computer scientists in Berlin announced that after four years of work, they had completed a system to digitally tape together the torn fragments. Engineers hope their software and scanners can do the job in less than five years, even taking into account the varying textures and durability of paper, the different sizes and shapes of the fragments, the assortment of printing (from handwriting to dot matrix) and the range of edges (from razor sharp to ragged and handmade). “The numbers are tremendous. If you imagine putting together a jigsaw puzzle at home, you have maybe 1,000 pieces and a picture of what it should look like at the end,” project manager Jan Schneider says. “We have many millions of pieces and no idea what they should look like when we’re done.”

Posted on January 23, 2008 at 4:19 PM • 43 Comments

Comments

Timmy303 January 23, 2008 4:47 PM

They did this in a fictitious television show in the 1980s called C.A.T. Squad. Had a Michael Douglas lookalike in the lead role. I was under the impression that, since it could be depicted on TV, reality must not be far away. I guess I was wrong 🙂

Berry January 23, 2008 5:12 PM

A friend who used to throw semi-annual game parties would, before the guests arrived, take two 1000-piece puzzles, dump them together on the dining table and then HIDE the boxes.

Reconstructing crosscut-shredded documents sounds much harder, especially if the shredders do like I do and shred unrelated documents (in my case, ALL my junk mail) just to make it more challenging.

The project of writing code to figure it out sounds like an interesting challenge, though.

Roy January 23, 2008 5:24 PM

A defense against unshredding chaff thrown out is to thoroughly mix the chaff in the hopper, then bag a quarter of that for today’s trash. When the hopper fills again, mix thoroughly, and take a quarter out to the trash. After several cycles you’ve got remnants of documents of several ages, with no complete document in any bag going out to the trash.
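As a rough illustration of the scheme above, the following sketch estimates how one document's chaff spreads across successive trash bags. It assumes perfect mixing and that every bag removes exactly a quarter of the hopper's contents; the function names are made up for this example.

```python
def fraction_in_bag(n: int, removed: float = 0.25) -> float:
    """Share of one document's chaff that leaves in the n-th bag
    taken out after it was shredded (n starts at 1).

    Assumes perfect mixing and a fixed removed fraction per bag.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return removed * (1 - removed) ** (n - 1)


def fraction_still_in_hopper(n: int, removed: float = 0.25) -> float:
    """Share of the document's chaff still in the hopper after n bags."""
    return (1 - removed) ** n


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, fraction_in_bag(n), fraction_still_in_hopper(n))
```

Under these assumptions no single bag ever holds more than a quarter of any one document, and each later bag holds a smaller share than the one before it.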

Kees January 23, 2008 5:25 PM

I thought they did a similar thing with the Dead Sea Scrolls in the early ’90s. IIRC the keepers of the scrolls had published photos of the parchment scraps which were then reassembled by pattern recognition software in the USA and published against the keepers’ wishes.

unary January 23, 2008 6:49 PM

@Bruce

So does this mean we now write our passwords on Bruce, instead of bits of paper in our wallet?

Michael Quinlan January 23, 2008 6:54 PM

“Shredded documents can be reassembled manually. After the Iranian Revolution and the takeover of the U.S. embassy in Tehran in 1979, Iranians enlisted local carpet weavers who reconstructed the pieces by hand. The recovered documents would be later released by the Iranian regime in a series of books called “Documents from the US espionage Den”. The US government subsequently improved its shredding techniques, by adding pulverizing, pulping, and chemical decomposition.”

from http://en.wikipedia.org/w/index.php?title=Paper_shredder&oldid=186138060

Jonathan Millett January 23, 2008 7:47 PM

I think I am safe until they can reconstruct a document from the ashes in my wood burning stove…

Draino January 23, 2008 7:58 PM

Interesting concept, if feasible… This reinforces my own personal policy of completely burning confidential documents in a remote area. Perhaps I’m just too paranoid 😛

Geoff January 23, 2008 8:19 PM

This isn’t so much “unshredding” as untearing. The excerpt you quote makes it clear: the machine-shredded stuff is confetti, and later on the article describes shredded paper as “unrecoverable”. The picture on the Wired article gives a clear indication of the size of the pieces being reconstructed.

I don’t mean to diminish the work they’re doing – this is a fantastic project, a rare intersection of history, social justice and technology, and they clearly have a great deal of very valuable work ahead of them. But this development doesn’t cast any extra doubt on the effectiveness of even simple paper shredders.

Bruce J January 23, 2008 10:12 PM

For an interesting claim of the reconstruction of burned documents see:
http://tinyurl.com/2nwwga
(the IRS evidence recovery paragraph)
I have used this company’s services in a more prosaic context for magnet coatings.

Back when I was in Her Majesty’s Canadian Forces, one way we got rid of papers we didn’t want others to read was to burn them, stir the ashes in water, and pour the result around.

Dale January 24, 2008 12:37 AM

The previous articles I have read about this mention they are only trying to put together the TORN documents, not the shredded ones.

A good shredder designed for keying material gives you a handful of dust as an output. There is not a hope in heck that you would ever get anything from that except fire starter.

Cos January 24, 2008 1:24 AM

When I have a document I don’t want anyone to get and I care enough about that, I don’t shred it. Rather, I rip it into just a few pieces (with tears going through the most important bits) and separate them. One piece in the recycle bin at work, another piece in the recycle bin at home, another piece saved for next week’s recycling, etc. I think there’s a lot more security in making it unlikely that the pieces will turn up at the same place, than in making it hard for people to reassemble them.

Dirk January 24, 2008 1:51 AM

I remember that some years ago, budget home shredders produced shreds so chunky you could simply put them together by hand.
Fortunately I have a fireplace to get rid of all personalized stuff.

Sparky January 24, 2008 2:50 AM

It appears as if they are only piecing together hand-torn pieces of paper. When done manually, the difficult part is finding the pieces that belong together, but in the normal process of tearing, they usually end up close to each other in the trash. Using software, it shouldn’t be very difficult to match the pieces: just scan everything at a reasonable resolution and identify paper type, texture, color, color of printing or handwriting, and, the most difficult, I assume, the contour of the piece. Keep in mind that you cannot assume that the contours of two pieces that were torn apart will match perfectly.

I think there is a theoretical limit to how far you have to shred a piece of paper before it is absolutely unrecoverable. Assuming only the information printed on the piece is used for the recovery (not the contours, texture, and such), one could calculate the number of bits of data contained in a piece, and there would be no point in tearing it into more pieces than there are bits on it. Say something is printed in 1×1 mm dots; there would be no point in tearing it into smaller pieces. There could be a problem if the tearing is not properly aligned with the information on the paper: a single piece could then still carry 4 bits of data, since a piece of a dot can still be read as a 0 or 1.

Eventually, the same thing happens as trying to crack a one-time pad; a single document can be “recovered” into any other document, if it has (approximately) the same amount of ink.

One thing that is obviously different from cryptography, is that the document doesn’t need to be perfectly recovered to be readable. Also, only the ink contours need to be recovered, large patches of the same color are hardly interesting.

Sparky January 24, 2008 3:07 AM

I don’t quite understand why they tried to shred everything, for starters, they couldn’t have possibly thought it actually destroyed the information, and it is hardly efficient. A truckload of paper can be burned in a reasonably sized backyard in a couple of hours, leaving very few recoverable pieces.

232.8 C January 24, 2008 3:38 AM

“A truckload of paper can be burned in a reasonably sized backyard in a couple of hours, leaving very few recoverable pieces.”

Some will escape if it’s not enclosed; drifting off in the smoke. I’ve burned a few sheets of paper outdoors in a nearly-empty paint tin and that happened.

j0hnner_ca January 24, 2008 4:41 AM

“So does this mean we now write our passwords on Bruce, instead of bits of paper in our wallet?”

@ unary

Nonsense. Bruce is too famous for that. Everyone would see your passwords at conferences!

Besides, I don’t think he’d like it very much.

Roger January 24, 2008 4:59 AM

Recovery of burned documents refers to partially burned, or charred documents. In this case the document is still basically legible, but too brittle to examine, so it is stabilised with a polymer spray.

Which is why standard military and government protocols for destruction of documents by fire require either a machine designed for the purpose, or an attendant who rakes through the ashes, crushing them and looking for any unburned bits. With one exception, you cannot recover documents from ashes. This is because the ash represents only a tiny fraction of the mass of the paper; the rest has been turned to gas, which is about as fine and microscopic a scrambling as possible. The exception is heavily glazed papers (glossy papers with all the pores filled with clay). In that case quite a lot of the mass of the paper remains as ash, and it is potentially recoverable if the ashes aren’t crushed carefully. Of course, no-one prints classified documents on glazed paper.

The recovery of the ripped up Stasi documents is also an example of when “it isn’t done right”. With far too much to shred in too little time, they turned to ripping up by hand, and the results — as seen in the report — are risibly weak. Or are they? What makes Schneider’s team’s achievements impressive is the number of fragments: 600 million.

When shredding is good enough to destroy long range information, or if we use an algorithm that cannot exploit it (e.g. because we have too many documents to hold them all in mind, or our computer cannot parse text fragments into topics), then the difficulty of reconstruction of shredded documents goes roughly as the square of the number of pieces. If you simply tear every page in half, but you have 300 million pages, that gives you a problem more-or-less as difficult as reconstructing 35,000 pages that have each been put through a confetti shredder. 300 million squared is approximately 2^56. But each fragment matching operation is thousands of times more complex than a trial DES encryption…
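The 2^56 figure can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming the naive cost model in which every fragment is compared against every other; the function name is invented for the example.

```python
import math


def naive_cost_bits(fragments: int) -> float:
    """Base-2 logarithm of fragments squared, i.e. the size of the
    naive all-against-all matching problem expressed in bits."""
    return math.log2(fragments ** 2)


if __name__ == "__main__":
    bits = naive_cost_bits(300_000_000)
    # 300 million squared is 9e16, which lies between 2^56 and 2^57.
    print(round(bits, 1))
```

Counting each unordered pair only once would halve the work, which lowers the exponent by exactly one bit and does not change the overall picture.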

It’s impressive that they think they can do it in only 5 years. I would have guessed it wasn’t possible to do it by simple edge matching. That suggests their algorithm has some method of elementary parsing of fragments, to assign them to subclasses with a higher probability of matches. Either that, or they’re throwing an impressive amount of iron at the problem!

What does this tell us about security of home shredders? Not a great deal, really. If you have very valuable secrets, such that people are prepared to spend immense effort for years in order to recover them, then 300 million well mixed fragments per disposal may not be good enough.

If all you’re trying to get rid of is “pre-approved” credit applications, then 5,000 well mixed fragments per disposal is probably OK. Between those limits your mileage may vary, but always try to get the most fragments per page that you can reasonably afford, accumulate the greatest convenient number of pages before disposal, and mix them thoroughly.

Mr. Para January 24, 2008 6:32 AM

Even the youngest of these documents are closing in on 20 years old, right? So it will primarily be of interest to historians. Seems to me, even if they were all readable now, that the steps they took to ensure the secrets they documented were kept secret were… good enough.

I’m also guessing that if the documents were really critically important, they would’ve been burned. And raked. And burned again. And then probably used as fertilizer.

unary January 24, 2008 6:50 AM

@j0hnner_ca at

Nonsense. Bruce is too famous for that. Everyone would see your passwords at conferences!

Besides, I don’t think he’d like it very much.

I’ve been under the impression from teh_interwebs and personal research that the brilliance of light emanating from Mr. Schneier would be enough to blind all those who gaze upon him in the flesh…

Sparky January 24, 2008 6:55 AM

There are a few things to keep in mind; for starters, a home-grade shredder doesn’t leave edges that can be relatively easily matched like hand-torn pieces of paper.

Also, hand-tearing is usually done with multiple pages at a time, if you have a lot to tear. After the first tear, you typically put the two stacks together to tear them again; this means that the fragments of a single document will very often be close together in whatever container you throw them in.

The 300 million pieces are everything they have collected, which were taken from a large number of offices. Thus, there are many separate challenges, each containing a fraction of the whole puzzle. 30 sets of 10 million pieces each would be 30 × (10e6)^2 = 3e15, as opposed to (300e6)^2 = 9e16. Still quite a challenge, of course.
I’d think it would be possible to reduce the complexity further, by dividing each set into a few simple classes, like printed or hand-written, type of paper, method of destruction.
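A small sketch of that arithmetic, assuming the same quadratic cost model and sets of equal size (the function names are hypothetical):

```python
def naive_cost(pieces: int) -> int:
    """Cost of comparing every piece against every other: pieces squared."""
    return pieces ** 2


def partitioned_cost(total_pieces: int, sets: int) -> int:
    """Cost when the pieces are split into equally sized, independent sets
    and matching is only attempted within each set."""
    per_set = total_pieces // sets
    return sets * naive_cost(per_set)


if __name__ == "__main__":
    whole = naive_cost(300_000_000)            # 9e16
    split = partitioned_cost(300_000_000, 30)  # 3e15
    print(whole, split, whole // split)
```

With equal sets the saving is simply a factor equal to the number of sets, so splitting each set further into classes multiplies the saving again.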

The text itself could be useful in several ways; one thing to check would be if it lines up properly. OCR on the words that were torn through, to check if the two halves of the word make sense, or checks on complete sentences could also be used, once a relatively small set of possible matches is discovered using the other methods.
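The torn-word check could look something like the sketch below. It assumes OCR has already produced the text on either side of a tear and that a word list is available; the function, the tiny vocabulary, and the sample fragments are all made up for illustration.

```python
def plausible_joins(left_half: str, right_halves: list[str],
                    vocabulary: set[str]) -> list[str]:
    """Return the right-hand halves that complete left_half into a known word."""
    return [r for r in right_halves
            if (left_half + r).lower() in vocabulary]


# Hypothetical word list and OCR output from candidate neighbouring pieces.
vocabulary = {"document", "shredder", "fragment"}
candidates = ["ment", "der", "ling"]
matches = plausible_joins("frag", candidates, vocabulary)

if __name__ == "__main__":
    print(matches)
```

A check like this is cheap but weak on its own, which fits the suggestion to apply it only after other methods have narrowed down the candidates.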

bob January 24, 2008 7:10 AM

@Sparky: These were Germans. They document EVERYTHING. (That’s why it took so long at the Nuremberg trials to issue the predesignated verdicts, they actually tried to read the documents.) They didn’t have a truckload to shred or even a couple dozen truckloads. They had long railroad freight train loads. And of course they couldn’t start shredding until the govt (agency) was about to fold. I picture the same thing happening (smaller scale of course) on the USS Pueblo or the US embassy in Tehran.

I have often looked at shreddings and mentally imagined the algorithm I would try to use to put the pieces back together after they had been scanned. Definitely a job for supercomputing.

Actually, shredded documents burn much better because of all the airspace around the pieces, so shred them before burning just for convenience.

I don’t burn mine or split the batches; it would be more secure, but with my personal documents the cost/benefit isn’t there. However I do shred nonsensitive stuff in with it for leavening and then put it in with my household waste and if possible throw any outdated meat or spoiled food in the trash bag with it for a couple of days, especially if it has gravy or soup.

The latest generations of GSA-style shredders produce an output that is better described as dust, it has no measurable dimensions; the period in a 10-pt font would probably spread across 6-10 pieces. I wish some of that capability would trickle down to the home market, the huge chunks my home shredder produces could have an entire SSAN on a single piece if it was in a small font and had been printed landscape instead of portrait.

I wonder if they will open the files up to anyone as a public attraction, kind of like we have done with Ellis Island records. I may have to go back and see if they noticed me.

Paeniteo January 24, 2008 7:44 AM

@Colossal Squid: “Shred, then compost.”

They didn’t quite have the time to do that then…

Paeniteo January 24, 2008 7:51 AM

@bob: “I wonder if they will open the files up to anyone as a public attraction, kind of like we have done with Ellis Island records. I may have to go back and see if they noticed me.”

You have the option now, already, for material that was not destroyed at all or that has been recovered in the meantime. You may request Stasi files relating to your person from the appropriate German government agency:
http://en.wikipedia.org/wiki/BStU

More information in the German Wikipedia:
http://de.wikipedia.org/wiki/BStU#Einsichtnahme_in_Stasi-Unterlagen

Nostromo January 24, 2008 8:04 AM

@Draino: “Perhaps I’m just too paranoid”

The question to ask is not, “Am I being paranoid?”

The right question to ask is, “Am I being paranoid enough?”

In a lot of places in Europe it’s illegal to burn things, for pollution reasons. What we need is a compact paper-burner with a fan that sucks the exhaust gases through water, to scrub out the giveaway smoke particles.

Lyle January 24, 2008 9:27 AM

The right question to ask is, “Am I spending my limited efforts addressing the most likely threats, and not just the most dramatic ones?”

anon1234 January 24, 2008 11:10 AM

A former NSA worker once told me that the three semi-trailer-sized shredders in the basement at Fort Meade in the 1960s were designed to take huge volumes of paper and turn them into something resembling talcum powder — which was then still considered classified at the level of the original documents. His opinion was that this was overkill, but he wasn’t directly involved in the system design, so he couldn’t be sure.

Matthew Skala January 24, 2008 1:48 PM

What was pointed out last time the security of shredded documents came up in this Web log, was that shredding has an advantage of immediate verifiability over burning. If you stick the document in the shredder you can see it turned into confetti right before your eyes, right there in the office, immediately. You can’t really operate an incinerator inside the average office on a moment’s notice, so you have to save the document until a more convenient time (which means it’s still in existence, and possibly a threat, while it waits) and/or give it to somebody else to destroy (which means they are trusted, in the technical sense: they have the ability to override your security policy). Both those are undesirable. If you can apply at least some level of secure destruction that’s immediately verifiable – i.e. shredding – before you submit it to the process that eventually leads to the not-immediately-verifiable flames, then you’re better off.

Remember Winston Smith? He burned all his secret documents.

DLL January 24, 2008 2:22 PM

My strategy is similar to Bob’s, but instead of meat or gravy… I use dirty diapers. Bioweapons program meets document destruction.

CipherChaos January 24, 2008 10:31 PM

An idea: Shred, then flush? Who’d want to dig for that?!

@raffi:
“shred, then burn?”

I’ve always liked that idea: Shredding makes the paper support combustion much more easily, and burning it makes sure there’s nothing to reassemble.

It’s not practical for large amounts of documents, though, because it’s going to attract a lot of attention in the form of smoke.

@Dale:
“A good shredder designed for keying material gives you a handful of dust as an output…”

Where do I get one?! I want!

@232.8 C [clever name BTW]:
This is because a flame creates its own wind; the updrafts carry the fragments away.

@Roger:
I’d love to have the disintegrator Dale mentions…

@The faux Bruce:
I suppose “Bruce Schneier” isn’t a reserved word for the name field. Interesting.

vwm January 25, 2008 11:15 AM

@Roger: “It’s impressive that they think they can do it in only 5 years”

I’d guess the complexity is reduced as those 600 million fragments have not been randomised completely, but are somehow clustered – i.e. very often all the bits of one document will be in the same garbage bag.

Quite a lot of those STASI documents have been reconstructed manually already – which would probably be impossible if someone really would have to search through millions of fragments.

@everyone suggesting burning, draining etc.: Stasi did try that, but there were just too many documents. They had to get rid of the documents in secret, as the revolution was already almost over when they started destroying the evidence of their crimes.

In Leipzig the destruction was stopped by the community after Stasi tried to flush the documents: the paper clogged the sewers, and the resulting flooding of the premises made people suspicious…

Not paranoid, just thorough January 25, 2008 1:16 PM

Yes, I shred all my sensitive paperwork with a crosscut shredder. But I go the extra mile. I use the shredded paper as tinder for lighting fires in my fireplace, and I dispose of the ashes with used kitty litter. While it may still be technically possible to reassemble the documents, I can’t imagine that anyone would actually want to.

Roger January 25, 2008 7:46 PM

@Nostromo:

“In a lot of places in Europe it’s illegal to burn things, for pollution reasons. What we need is a compact paper-burner with a fan that sucks the exhaust gases through water, to scrub out the giveaway smoke particles.”

Many places that require higher security than shredding, but cannot incinerate (often for safety reasons) use a pulper. There are industrial grade pulpers that can eat an entire dumptruck of paper at once, and GSA office grade pulpers that consume about 8 lbs/minute (about 900 pp/min.)

However for pulping small amounts e.g. for home use, you can use an old food blender. Pick one up secondhand for $10 or less, add a cupful of warm water per page (up to 3 or 4 pages at a time depending on the size and power of your blender, YMMV), let it soak for 30 seconds to soften then hit the “liquefy” button. You can get destruction (almost) rated for top secret, from a machine which costs less than the cheapest shredder. Main disadvantage is that a food blender is much slower than a shredder.

Small amounts of pulp can be tipped into a toilet, larger amounts can be composted or used for craft projects, very large amounts can be recycled. GSA pulpers for classified documents require all the pulp to be passed through a screen to check for any bits that may have been missed, but this step is probably overkill for home or small office use, especially if the pulp is going to be flushed or composted.

One caveat with using a blender for pulping: they really react badly to document waste that isn’t paper, such as paperclips, staples or plastic inserts.

Anonymous January 30, 2008 3:22 AM

@Roger:

Just hold the lid on tight, and wear safety goggles. Those warnings on “Will It Blend?” are for wimps… don’t worry about metal objects.

If your blender gets killed by those, you need one from the company that makes those clips! 😉

Clive Robinson July 9, 2008 12:13 PM

If you are going to liquefy, why use water?

The absorbency rate of modern paper used for laser printing and photocopying is low (which is why it’s not good for high quality ink jet work).

Also as noted above the “finer finish” paper has either china clay or talcum powder used as a surface finishing/polish (as do some ink jet papers).

Low grade (two stroke) petrol or other light hydrocarbons have a considerably higher absorption rate and if the “slurry process” is done correctly can actually be used as a replacement for heavy oil in some older water heaters etc.

Also there are semi-industrial processes where the hydrocarbon is recovered almost entirely from the “pulp” (getting essential oils from such things as rose petals etc). This would leave you with pressed out “fire briquettes” of the pulp ready to use as fire lighters or as a replacement / augment for solid fuel “coal/coke” burners.

Odd as it might sound, as you are burning mainly paper in the briquettes it is almost as “environmentally friendly” as burning wood…

Also if you are going to use water, why not add sugar and an appropriate yeast to it and put the slurry in a methane digester to make biogas to augment / offset your use of mineral “natural gas”?
