File deletion is all about control. This didn't use to be an issue. Your data was on your computer, and you decided when and how to delete a file. You could use the delete function if you didn't care whether the file could be recovered, and a file erase program -- I use BCWipe for Windows -- if you wanted to ensure no one could ever recover the file.
Now that so much of your data sits on other companies' servers, you have to trust that these companies will delete it when you ask them to, but they're generally not interested in doing so. Sites like these are more likely to make your data inaccessible than they are to physically delete it. Facebook is a known culprit: actually deleting your data from its servers requires a complicated procedure that may or may not work. And even if you do manage to delete your data, copies are certain to remain in the companies' backup systems. Gmail explicitly says this in its privacy notice.
Online backups, SMS messages, photos on photo sharing sites, smartphone applications that store your data in the network: you have no idea what really happens when you delete pieces of data or your entire account, because you're not in control of the computers that are storing the data.
This notion of control also explains how Amazon was able to delete a book that people had previously purchased on their Kindle e-book readers. The legalities are debatable, but Amazon had the technical ability to delete the file because it controls all Kindles. It has designed the Kindle so that it determines when to update the software, whether people are allowed to buy Kindle books, and when to turn off people's Kindles entirely.
Vanish is a research project by Roxana Geambasu and colleagues at the University of Washington. They designed a prototype system that automatically deletes data after a set time interval. So you can send an email, create a Google Doc, post an update to Facebook, or upload a photo to Flickr, all designed to disappear after a set period of time. And after it disappears, no one -- not anyone who downloaded the data, not the site that hosted the data, not anyone who intercepted the data in transit, not even you -- will be able to read it. If the police arrive at Facebook or Google or Flickr with a warrant, they won't be able to read it.
The details are complicated, but Vanish breaks the data's decryption key into a bunch of pieces and scatters them around the web using a peer-to-peer network. Then it uses the natural turnover in these networks -- machines constantly join and leave -- to make the data disappear. Unlike previous programs that supported file deletion, this one doesn't require you to trust any company, organisation, or website. It just happens.
Of course, Vanish doesn't prevent the recipient of an email or the reader of a Facebook page from copying the data and pasting it into another file, just as Kindle's deletion feature doesn't prevent people from copying a book's files and saving them on their computers. Vanish is just a prototype at this point, and it only works if all the people who read your Facebook entries or view your Flickr pictures have it installed on their computers as well; but it's a good demonstration of how control affects file deletion. And while it's a step in the right direction, it's also new and therefore deserves further security analysis before being adopted on a wide scale.
We've lost control of data on some of the computers we own, and we've lost control of our data in the cloud. We're not going to stop using Facebook and Twitter just because they're not going to delete our data when we ask them to, and we're not going to stop using Kindles and iPhones because they may delete our data when we don't want them to. But we need to take back control of data in the cloud, and projects like Vanish show us how we can.
Now we need something that will protect our data when a large corporation decides to delete it.
This essay originally appeared in The Guardian.
Posted on September 10, 2009 at 6:08 AM