File Deletion

File deletion is all about control. This used to not be an issue. Your data was on your computer, and you decided when and how to delete a file. You could use the delete function if you didn’t care about whether the file could be recovered or not, and a file erase program—I use BCWipe for Windows—if you wanted to ensure no one could ever recover the file.

As we move more of our data onto cloud computing platforms such as Gmail and Facebook, and closed proprietary platforms such as the Kindle and the iPhone, deleting data is much harder.

You have to trust that these companies will delete your data when you ask them to, but they’re generally not interested in doing so. Sites like these are more likely to make your data inaccessible than they are to physically delete it. Facebook is a known culprit: actually deleting your data from its servers requires a complicated procedure that may or may not work. And even if you do manage to delete your data, copies are certain to remain in the companies’ backup systems. Gmail explicitly says this in its privacy notice.

Online backups, SMS messages, photos on photo sharing sites, smartphone applications that store your data in the network: you have no idea what really happens when you delete pieces of data or your entire account, because you’re not in control of the computers that are storing the data.

This notion of control also explains how Amazon was able to delete a book that people had previously purchased on their Kindle e-book readers. The legalities are debatable, but Amazon had the technical ability to delete the file because it controls all Kindles. It has designed the Kindle so that it determines when to update the software, whether people are allowed to buy Kindle books, and when to turn off people’s Kindles entirely.

Vanish is a research project by Roxana Geambasu and colleagues at the University of Washington. They designed a prototype system that automatically deletes data after a set time interval. So you can send an email, create a Google Doc, post an update to Facebook, or upload a photo to Flickr, all designed to disappear after a set period of time. And after it disappears, no one—not anyone who downloaded the data, not the site that hosted the data, not anyone who intercepted the data in transit, not even you—will be able to read it. If the police arrive at Facebook or Google or Flickr with a warrant, they won’t be able to read it.

The details are complicated, but Vanish breaks the data’s decryption key into a bunch of pieces and scatters them around the web using a peer-to-peer network. Then it uses the natural turnover in these networks—machines constantly join and leave—to make the data disappear. Unlike previous programs that supported file deletion, this one doesn’t require you to trust any company, organisation, or website. It just happens.
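The key-splitting step can be illustrated with threshold secret sharing. What follows is a minimal Shamir-style sketch in Python, not the Vanish code itself: the field prime, the share counts, and the function names `split_secret` and `recover_secret` are all assumptions chosen for illustration.

```python
import secrets

# A Mersenne prime far larger than a 128-bit key (an assumed field size).
PRIME = 2**521 - 1


def split_secret(secret, n, k):
    """Split `secret` into n shares so that any k of them reconstruct it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range for the field")
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Lagrange-interpolate the polynomial at x = 0 from (x, y) shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


if __name__ == "__main__":
    key = secrets.randbits(128)             # stands in for a data key
    shares = split_secret(key, n=10, k=7)   # scatter these across peers
    print(recover_secret(shares[:7]) == key)
```

In a scheme of this kind, once churn has carried away enough shares that fewer than `k` remain reachable, the key can no longer be rebuilt and the ciphertext stays unreadable.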

Of course, Vanish doesn’t prevent the recipient of an email or the reader of a Facebook page from copying the data and pasting it into another file, just as Kindle’s deletion feature doesn’t prevent people from copying a book’s files and saving them on their computers. Vanish is just a prototype at this point, and it only works if all the people who read your Facebook entries or view your Flickr pictures have it installed on their computers as well; but it’s a good demonstration of how control affects file deletion. And while it’s a step in the right direction, it’s also new and therefore deserves further security analysis before being adopted on a wide scale.

We’ve lost control of data on some of the computers we own, and we’ve lost control of our data in the cloud. We’re not going to stop using Facebook and Twitter just because they’re not going to delete our data when we ask them to, and we’re not going to stop using Kindles and iPhones because they may delete our data when we don’t want them to. But we need to take back control of data in the cloud, and projects like Vanish show us how we can.

Now we need something that will protect our data when a large corporation decides to delete it.

This essay originally appeared in The Guardian.

EDITED TO ADD (9/30): Vanish has been broken, paper here.


Posted on September 10, 2009 at 6:08 AM • 63 Comments

Comments

davidheath • September 10, 2009 6:34 AM

one can only hope that local encryption in association with cloud storage will tilt the ship somewhere back to an upright state

JS • September 10, 2009 7:09 AM

The distinguished Radia Perlman wrote a notable paper on this topic: http://portal.acm.org/citation.cfm?id=1131279.

A Nonny Bunny • September 10, 2009 7:15 AM

Once data is public, you lost control. A system like Vanish won’t help, because as you say, someone can just copy the data.
If you want to keep control over your data, you need to keep it private; if you also want to store it publicly (e.g. gmail), then you need to encrypt it. Delete/forget the key, and it’s gone.
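A minimal sketch of the "encrypt it, then delete the key" idea, in Python. It uses a one-time pad purely to keep the example inside the standard library; a real setup would use an established cipher and a proper key store, and the function names here are hypothetical.

```python
import secrets


def encrypt(plaintext):
    """Return (key, ciphertext) using a random pad as long as the data."""
    key = secrets.token_bytes(len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))
    return key, ciphertext


def decrypt(key, ciphertext):
    """XOR the pad back out to recover the plaintext."""
    return bytes(c ^ k for c, k in zip(ciphertext, key))


if __name__ == "__main__":
    message = b"everyday photos, comments and emails"
    key, blob = encrypt(message)   # only `blob` is uploaded; `key` stays local
    print(decrypt(key, blob) == message)
    del key                        # with the key gone, `blob` is just noise
```

The deletion problem then shrinks to destroying one small local key, which only works if no other copy of the key or the plaintext exists anywhere.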

Thomas • September 10, 2009 7:27 AM

“We’re not going to stop using Facebook and Twitter …”

because we never started using it 🙂

niranjan • September 10, 2009 7:44 AM

What one can falsely infer here is that the data itself will have an expiry date set. So even if the public data from your blog or twitter is archived in archive.org or Google, it gets deleted.

Apart from the existing simple solution of encrypting private data stored either locally or on cloud, this new solution doesn’t do anything more, it just deletes the key albeit in a novel way and automatically.

clvrmnky • September 10, 2009 7:48 AM

@Thomas:

I suspect he is using the word “we” to mean the computer and internet using public in the aggregate. The fact is that these services are wildly popular with many, regardless of the individuals who decided not to participate.

@Bunny:

Bruce does, in fact, point this out himself. However, it is important to point out that we aren’t talking about precious or sensitive information here — just everyday photos, comments and emails. Being able to control the /source/ of this mundane material is at the heart of Bruce’s essay. In this respect, Vanish is a perfectly reasonable response (at least in terms of research) because the fact is most folks leave the stuff they get from others on the cloud.

That is, I know of no one who religiously copies their Gmail email to another account they have complete control over. How many of us save copies of every Facebook image or comment locally? How about IM chats? You are right that there isn’t much we can do about that, but this is not really the point.

Vincent • September 10, 2009 8:14 AM

Seems as though as long as views and clicks subsidize storage and retrieval, it’ll always be cheaper to keep it than to vanish it.

Alan Porter • September 10, 2009 8:19 AM

“We’re not going to stop using Facebook and Twitter just because they’re not going to delete our data when we ask them to…”

I think you mean “they’re not going to delete THEIR data”.

We seem to think that the stuff we contribute to these web sites is somehow OUR data. It’s not. It’s THEIRS. But they thank you for submitting it.

B. Real • September 10, 2009 8:20 AM

While I agree with the basic premise, I have to take issue with nomenclature.

Gmail (and its brethren) is not a Cloud Computing Platform, it is SaaS that may [or may not] run “in the cloud”. Data is not at risk because of the cloud, but rather because of our entrusting it to a third-party service provider. I have virtually no control over the data that I entrust to a SaaS provider, while I have almost complete control over the data that I put in to my cloud platform.

This is the same problem, only magnified, that caused similar angst in the days of The Well, CompuServe, and your local BBS.

Pádraig Brady • September 10, 2009 8:36 AM

Even programs running locally aren’t guaranteed to wipe a file. Due to the complexity of modern journaling file systems, your data could be anywhere on disk.

David • September 10, 2009 8:39 AM

I think that what we really need is a change in attitude.

Most of what I tell people is ephemeral, and I can generally count on it not going much further. This means I’m free to say embarrassing things about myself, or things that could conceivably get me in trouble. I consider this to be healthy.

Thing is, this doesn’t work on the web, and hence on Facebook. Once it’s out there, it’s out. More people than you expect will get a copy, and any of them could save it. Any site you put it on could keep a copy, whether or not they say they won’t, or even if they enable a scheme to let the public copy degrade to unreadability automatically.

Now, Facebook is the main way I have of keeping track of certain friends and relatives, so it’s very tempting to talk to them on Facebook as if we were physically together in public. From the point of view of privacy, it’s nothing like it. I get into conversations with friends of friends all the time (using the Facebook definition of “friend” here), and so if a mutual Facebook friend of ours starts a conversation or posts something, I can see it.

I said above that we need a change in attitude, and I’m not sure what. One possibility is that people become more guarded about what they post about themselves and their friends. Another is that people in general stop taking public personas as seriously, and recognize that most people do things and say things not meant for public consumption.

Contrary Guy • September 10, 2009 8:52 AM

@ Alan Porter

Actually, the content IS ours, we wrote it, we have copyright to it. Perhaps we are automatically giving them licence to copy by using the service, but the content remains our own.

Do you think that the RIAA gives up control over every song they put up on iTunes?

Clive Robinson • September 10, 2009 9:03 AM

@ Bruce,

“It just happens.”

Naughty Naughty, you know that’s not really true 😉

Such are the joys of trying to put over a complex idea in simple terms.

Clive Robinson • September 10, 2009 9:18 AM

I have actually designed systems that do something similar in the past.

There are a number of issues.

The first and simplest is an encryption system where the “whole file” is required to decrypt it (Bruce has posted about such systems).

The second is ensuring that you genuinely split the file. That is, you can absolutely guarantee that immutable copies of the parts cannot be aggregated by another party (think a three-way split, three organisations with three backup tapes, and an entity with three court orders 😉).

Which highlights the real issue: it is an active system, and if you break the system then there are no guarantees on the outcome.

Oh and if you chose to reveal the plain text to another person (or device) at any time then it can be copied.

Data destruction is actually a much harder problem than data availability, and let’s be honest, we have not cracked that problem yet 8(

Alexey • September 10, 2009 9:28 AM

Probably it will be offtopic, but..

Speaking about online backups on untrusted servers, what do you think about the rsyncrypto encryption algorithm?

http://rsyncrypto.lingnu.com/index.php/Algorithm

They claim that it adds the possibility of incremental backups in exchange for ‘slightly’ reduced encryption strength, but because I’m pretty lame at cryptography, I’m not sure how seriously this ‘slightly’ should be taken.

Having all data encrypted means I shouldn’t care about its deletion; I can just remove the key and be safe.

Any comment on that?

Thank you.

Tom • September 10, 2009 10:07 AM

Just a quick question regarding “erasing” data. Is it possible to recover data after a single pass over-write with pseudo-random data without using specialized equipment, i.e. just using the standard hardware that most PCs come equipped with?

Kenneth Mayer • September 10, 2009 10:25 AM

The only issue with data deletion is that it never really is gone if it is in the cloud. They do have tape backup and virtual tape backup.

Vincent • September 10, 2009 10:27 AM

Also makes me consider the way we tend to approach open source and open content, with very little thought or provision for revocation.

David • September 10, 2009 10:45 AM

@Tom: No, the overwrite is satisfactory deletion for most practical purposes.

However, I wouldn’t be too surprised if the NSA could read a disk file that’s been overwritten once, and I don’t know what will happen in the future. It may be that there simply isn’t enough history left for reconstruction, but if I were worried about either the NSA or less resourceful people twenty years from now, I’d destroy the disk.

Note that this becomes less reliable as the data medium gets smarter. If the disk drive is smart enough to recognize and remap bad sectors, it may leave a mostly readable sector even when the file itself is overwritten. Flash memory (like in thumb drives) does that as a matter of course, to even wear. These may not be readable by the average script kiddie, but I wouldn’t be surprised if a competent hobbyist was able to read them.

E.D F. • September 10, 2009 11:07 AM

Since that data is likely to be trawled by search engines, that data will be available in various caches, publicly. The copy will be made automatically, even if nobody decides to keep a copy. Automated keyword recognition will cause copies to be made by even more bots if the content matches what those bots are looking for.

Bryan Feir • September 10, 2009 11:11 AM

@Tom:

As David says, no it isn’t possible, at least as long as you’re sure you’re overwriting the actual sectors used by the file. Which is normally true for an in-place overwrite on a disk, though issues with journaling file systems can complicate that.

Even with specialized equipment, it’s a lot more difficult to recover overwritten data than it used to be. Many years ago you could often pick up data from the sides of the data track because the head didn’t necessarily track the old path exactly, and the older data may have spread out more than the recently-written data. Modern high data densities require much more precise tracking, and a much stronger magnetic field, reducing the amount of data left behind.
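A rough sketch of an in-place overwrite followed by deletion, in Python. The function names are made up for illustration, and the caveat above applies in full: on journaling or copy-on-write file systems, and on media that remap sectors, the bytes written here may not land on the blocks that held the original data.

```python
import os


def overwrite_in_place(path, passes=1, chunk_size=1 << 20):
    """Overwrite the file's existing bytes with pseudo-random data."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:        # r+b: write without truncating
        for _ in range(passes):
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                f.write(os.urandom(n))
                remaining -= n
            f.flush()
            os.fsync(f.fileno())        # ask the OS to push data to the device


def overwrite_and_delete(path, passes=1):
    """Overwrite the file, then remove its directory entry."""
    overwrite_in_place(path, passes)
    os.remove(path)
```

Calling `overwrite_and_delete("notes.txt")` (a placeholder file name) performs a single pass, matching the single-pass case Tom asks about; the `passes` argument is there for anyone who wants more.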

Nostromo • September 10, 2009 11:12 AM

“As we move more of our data onto … closed proprietary platforms such as the Kindle and the iPhone, deleting data is much harder … We’ve lost the control of data on some of the computers we own”

You don’t have to move any of your data onto “closed proprietary platforms”. If you do, well, you deserve the consequences. How about taking some personal responsibility?

Jack • September 10, 2009 11:13 AM

And what would you do with “something that will protect our data when a large corporation decides to delete it”? If someone showed you something that made Vanish look like a toy and blew away all encryption-derivative-type stuff, what would you do? Write about it? Analyze it to death, only to say we can never, really, really, be sure? Or would you call it snake oil and move on?

james • September 10, 2009 11:37 AM

you single out the “cloud”, but a lot of this is true even for a standard corporate environment or any other place where the computer may not be entirely managed by the user. all of these places could have automated backup processes that keep data alive long after you have “deleted” it from your local disk drive.

William V. • September 10, 2009 12:00 PM

Your analogy with file recovery and the use of BCWipe is more than an analogy. I recently used file recovery software to retrieve files deleted from other people’s EC2 AMI:

http://stage.vambenepe.com/archives/922

And actually Amazon AWS does a very good job resetting blocks on raw volumes that they hand out. Much better than you can expect from most “private cloud” infrastructure providers out there. Just try and analyze the blocks next time you’re handed a VM with some volumes mounted.

Tom • September 10, 2009 12:21 PM

Thank you for the responses. I feel a bit safer knowing that specialized equipment is required. Right now I use “eraser” and have it set to use pseudo-random over-write of unused disk space once a week. I also use it in 2-pass pseudo-random mode to erase files instead of using the delete function in windows. From the comments, any attempt to recover the data would prove costly and hence low risk since my data isn’t special enough.

Chaz • September 10, 2009 1:11 PM

@clvrmnky “I know of no one who religiously copies their Gmail email to another account they have complete control over…”

I do. My gmail is downloaded to two geographically separate hard drives, and some of it is encrypted so the cloud can’t read it. I don’t actually care about my gmail all that much–I just set it up that way as practice, after reading this blog too often.

Having extra copies of things makes them harder to delete, but encryption with a distributed key may be a good start.

Caleb • September 10, 2009 2:54 PM

Richard Stallman wrote a bit about this in his essay about “Trusted” Computing.

http://www.gnu.org/philosophy/can-you-trust.html

Daniel • September 10, 2009 3:04 PM

“And even if you do manage to delete your data, copies are certain to remain in the companies’ backup systems. Gmail explicitly says this in its privacy notice.”

Right. Actually, the Gmail Privacy Notice says it “may” remain in the backup systems. Any company who doesn’t say so is simply lying, so I always thought that extra bit of honesty was an excellent move on the part of Google.

“Unlike previous programs that supported file deletion, this one doesn’t require you to trust any company, organisation, or website. It just happens.”

No, it does require trust. It requires that you trust that nobody who viewed the file (and got the key in time to do so) leaves a copy of the key behind, for example on their backup system. The only difference is whether you trust someone whom you chose (e.g. Gmail), or whether you trust someone random (any participant in the Vanish p2p net that requested the keys).

I guess one could make the case for either.

sle • September 10, 2009 3:27 PM

With an SSD, I guess the problem is even worse, as the data medium uses user-“transparent” data reallocation techniques in order to increase lifetime.
How can you be sure you removed a file on an SSD or USB key?

dan • September 10, 2009 4:05 PM

When I started reading this long post about file deletion, I figured out, oh it’s again just a story on how people press the DEL key and regret their action.

Then “vanish” came, another spammy commercial thing ?

Not at all, the gossip turned around the fact that how long google or facebook are going to keep my pix of myself on the beach in a bathing suit with my kids and wife.

The best solution is to “not publish” that kind of data.

who cares if now or in 10.000 years, a homo erectus was proud to show his life to the world.

I’ve worked in the plane manufacturing industry and the median retention time is 40 years.

By this time, you will be dead and eaten by maggots, when some geek will restore those pix. and will “LoL” out on the strange practises of the post-industrials geeks, we all are.

publish publish my brothers, but don’t complain or blame the inventors of the best communication tools ever invented.

As it says on most of the legal notices : I accept terms and conditions….

So if you are scared on how your personal data will be handled for the next 200 years. Don’t publish !!!

Create your own “facebook site” invite the community, friends, relatives or strangers to share your lack of recognition.

Name it :Jesuspace or whatever

the problem is that, it is just easy to put things on the web as putting a pizza in the mw oven.

World of sheeps !

my best advice : a 100000 bucks fee to become a member of MSP or FB or GOOG.

this will tremendously reduce the members amount, and force people to read books and write letters to friends.

Sorry for this “one way sarcastic critic” of the ruined mental deprecated mankind mind.

what would happen if we go back to a pen, rubber and paper philosophy.

Dror Harari • September 10, 2009 5:04 PM

@sle is right – when you use SSD (Solid State Disk), you cannot actually write over a file. In virtually all cases, you simply write new copies on other places in the disk. Wiping programs intended for use with hard drives just make the problem worse when using an SSD device since you are just adding more copies of the file on the storage.

As for on-line storage, I actually wonder what would be the effect of repeated editing of an online document. I tend to think that while certainly companies hold backups, they do not necessarily store all versions of a file so if you can edit a file stored online, replacing all characters gradually with X (including spaces) you might actually get your document to be effectively deleted. A lot of ‘might’, ‘possibly’ and ‘likely’ but worth considering…

Clive Robinson • September 10, 2009 5:24 PM

@ Tom,

“Is it possible to recover data after a single pass over-write with pseudo-random data without using specialized equipment, i.e. just using the standard hardware that most PCs come equipped with?”

Unfortunately the answer is “it depends”.

All hard disks these days have their own microprocessors and semi specialised hardware.

However two things are true in many cases,

1, To reduce costs the electronics are usually designed to be used with “sloppy” mechanics.

2, The firmware in the hard disk controller may well be flash memory.

As David and Bryan Feir pointed out “usually” you would require specialised equipment to be able to get at the head amps and head position actuator.

However if somebody who knows a lot about the drive reprograms the hard drive controller CPU then it may be possible with some drives.

The days of MFM and RLR 3/4 coding on the hard drive platters are long gone and some researchers say that it is not possible to read any residual old data off of the hard drive platter.

However the paper they published did not go into sufficient detail for me to judge if the tests they did were sufficient to make the claim.

For the claim to be true the old data must be sufficiently overwritten such that the residual magnetic field it leaves is smaller not just than noise but smaller than any correlation or averaging techniques that might lift the residual signal above the noise floor.

That being said the question then arises,

‘Why would anybody want to do it to your hard drive?’

And the simple answer is: if they can physically get at it, what would be the advantage to them…

Paranoia aside it is often difficult for ordinary users to tell how valuable the data on a hard drive is.

For instance an email from an engineer to a project director may end up on the director’s laptop. The director may in turn allow their secretary or another director access to their files etc to allow a report to be written.

The secretary or financial director may not know how valuable a few apparently meaningless numbers may be.

There is also a further problem in that you do not know today just how valuable a file might be in future times (say a photo of a company day at the beach where a young lady is seen topless, she then later goes out with a member of the royal family).

So for what is a very low cost in ordinary terms and possibly too small to measure compared to the unknown risk / cost of disclosure you might as well take a few minor precautions.

Such as: don’t allow other people to access your files, use a fully encrypted hard drive, do a multipass overwrite on files under ordinary use, and use a secure destruction method at the end of the disk’s life (if you don’t care about the environment then a sledgehammer and a good bonfire should do the job 8)

Roy • September 10, 2009 5:42 PM

Tom,

Thanks for the idea of regularly overwriting unused space. After mulling over how, I worked out a simple way to generate the commands to overwrite every whole GB of free space (and then to release it).

On my laptop, ‘spoiling’ 210 GB takes less than an hour and a half.
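Roy does not list his commands, so the following is only one possible way to do something similar, sketched in Python. It fills free space with a scratch file of random data and then deletes it; the function name, the scratch file name `spoil.tmp`, and the `reserve_bytes` safety margin are all assumptions.

```python
import os
import shutil


def spoil_free_space(directory, reserve_bytes=1 << 30,
                     chunk_size=1 << 20, limit_bytes=None):
    """Fill free space under `directory` with random data, then release it.

    Leaves `reserve_bytes` untouched so the system does not run out of disk.
    Returns the number of bytes written.
    """
    free = shutil.disk_usage(directory).free
    target = max(0, free - reserve_bytes)
    if limit_bytes is not None:
        target = min(target, limit_bytes)   # optional cap, handy for dry runs
    scratch = os.path.join(directory, "spoil.tmp")
    written = 0
    try:
        with open(scratch, "wb") as f:
            while written < target:
                n = min(chunk_size, target - written)
                f.write(os.urandom(n))
                written += n
            f.flush()
            os.fsync(f.fileno())
    finally:
        # Release the space again, even if the disk filled up early.
        if os.path.exists(scratch):
            os.remove(scratch)
    return written
```

This only touches blocks the file system is willing to hand out as free space; remnants in metadata, journals, or remapped sectors are not reached.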

Godel • September 10, 2009 6:26 PM

I use eraser as my general file delete program, but using it on free space on a regular basis would cause me problems.

I have found it caused Kaspersky 2009 to forget its security settings for programs (high or low risk etc.) I speculate Kaspersky writes some hidden files in the free space, a little like Truecrypt’s hidden volumes.

Rahul • September 11, 2009 12:38 AM

Hello,

I feel the strict application of data retention laws can solve the problem to a large extent. Also the laws have to be improved so that they cover all types and mediums of backups where the data is stored. They should also specify a proven technique, tool or program to use to make sure the data can no longer be recovered and is completely deleted.

Thanks.
Rahul.

Jeffrey • September 11, 2009 1:31 AM

  After taking numerous classes at the DCITA (http://www.dc3.mil/) I have learned a lot of techniques that are used to recover data.
 As far as most law enforcement agencies (and DCITA) are concerned you overwrite anything a few times and it is pretty much toast unless you want to pay big dollars to have someone attempt recovery efforts using a rotating electron microscope (i.e. the NSA is about it).  The NSA considers a hard drive properly destroyed only if you drop it into a metal shredder.
 As far as my personal preferences go for data destruction, I normally do the following:

I do anything of any importance to me (i.e. encrypting or decrypting files or reading sensitive information) inside a VM that has its entire hard disk image fully encrypted (i.e. I installed the Windows OS in VMware and then full-disk encrypted it with TrueCrypt; afterwards I made my initial snapshot).
On my host system, I have my pagefile and any system caching services disabled completely. The pagefile is a goldmine to a forensic examiner. Some of the “state” data for VMs can be sometimes retrieved from the host’s pagefile.
When I am not using my computer, I wipe my system completely (to include the areas where I store my VM images) with the program called CyberScrub.

http://www.cyberscrub.com/products/privacysuite/

 For anything else that is important to me that I need to actually store, I use my IronKey for this purpose.

kevinm • September 11, 2009 4:31 AM

@Tom,
after a track on the disk is overwritten it is just not possible to recover it on a realistic budget. However you have to be sure that it is overwritten. On most filesystems it is a difficult problem to erase data in unused space and metadata space if access to the disk is via the operating system. See “Disk Wiping by Any Other Name by Hal Berghel and David Hoelzer” and work by Simson Garfinkel.

If I use an erase program on an NTFS formatted disk that is still mounted and then afterwards mount it in a Linux system and examine it with ‘foremost’, ‘dd’ and ‘strings’ then I will see some data remnants, perhaps not whole files but chunks of data in the $MFT and ADS.

g • September 11, 2009 9:09 AM

How can you open an encrypted Google doc without sharing the key with the Google cloud?

Joe. • September 11, 2009 9:45 AM

I have been hoping someone would come up with a form of personal DRM for a while. This sounds like it has promise and I hope people keep up the research.

Mike. • September 11, 2009 9:48 AM

A complaint was made to the Canadian federal Privacy Commissioner in the spring, claiming that Facebook violated Canadian law by not deleting information when a user closed their account or removed content. The Privacy Commissioner agreed and Facebook agreed to make changes.

http://www.priv.gc.ca/media/nr-c/2009/nr-c_090827_e.cfm

suomynona • September 11, 2009 10:05 AM

“We’re not going to stop using Facebook and Twitter just because they’re not going to delete our data when we ask them to, and we’re not going to stop using Kindles and iPhones because they may delete our data when we don’t want them to.”

Why not? Is there a gun pointed at your head telling you to use social networking?

web 2.0 social apps are ruining people. you wouldnt tell a stranger on the street intimate details about yourself, yet we post these things in public on the web (i dont care what privacy setting you use, IT’S PUBLIC).

i strongly advocate facebook suicide.

Let’s end this voyeuristic state of being.
That’s what porn is for.

berkut • September 11, 2009 11:02 AM

Use shred under Linux. Anybody?

David • September 11, 2009 1:04 PM

@suomynona: I find Facebook very useful in keeping track of friends and relatives I don’t see as often as I like. It’s efficient, being a broadcast medium that allows conversations. I don’t see that we can get what I find useful about Facebook without the associated problems.

bob • September 11, 2009 1:29 PM

“…We’re not going to stop using Facebook and Twitter …”

I won’t join them with the EULA as is. I recently turned down a very lucrative checking account because it required electronic statements which, in turn, had a EULA that basically shifted any responsibility for password/information/monetary loss from the bank to me.

Come on people, it’s grade three basics: read everything before you sign it, and DON’T sign if it’s disadvantageous. When I applied for my first mortgage (a construction loan), the mortgage company had a line in the fine print which said “and if the builder goes bankrupt, we retain the right to hire another company to finish it and bill you for the cost”. My response was “not just no, but HELL NO”. They said “oh, that’ll never come up”. So I said “in that case you won’t mind removing it”. They said they couldn’t change it so I walked out. A couple of days later they agreed to remove that line. I signed. 1 year later, house still unfinished, the builder went bankrupt. I had lost my job the same week, so my obstinacy probably saved my life (I still live in that house).

JUST SAY “NO” (or for python fans – “ni!”). It’s simple survival of the fittest – if enough people stopped lining up to get this crap then better stuff would come along to replace it.

Hello All • September 13, 2009 3:31 PM

Good article Bruce;

While Vanish seems like a useful tool, A Nonny Bunny said it well:

“Once data is public, you lost control. A system like Vanish won’t help, because as you say, someone can just copy the data.
If you want to keep control over your data, you need to keep it private; if you also want to store it publicly (e.g. gmail), then you need to encrypt it. Delete/forget the key, and it’s gone.”

So if ideas like Vanish are implemented wide-scale, companies would likely start configuring systems to automatically make redundant/offline copies of our data (if they don’t already…) as soon as the data is posted. Bottom line in the “digital” era: if you don’t want it seen by everyone potentially forever, don’t post it in a public place at all.

As for comments on low-level issues like file recovery from hard drives, this certainly is possible, but probably not practical on large-scale systems. Sites like gmail and facebook have hundreds of thousands of hard drives in thousands of server farms and very complex filesystems and ways of storing the data. This means if a person deletes something at a “high-level” web-style interface, it’s likely that the exact sectors of the exact hard drive(s) holding the data in question will be overwritten before the company/authorities can find it and attempt recovery. Companies who want to retain data would then likely use a redundant/offline data backup solution.

Jason • September 13, 2009 5:56 PM

The problem with Vanish is that the entire key set is stored in a publicly accessible key/value store. Since the system is available to everyone, it should be relatively easy to write a system which will store all of the decryption keys for long term use.

In other words, the attacker retrieves all of the decryption keys for all of the files. That will enable the attacker to decrypt any file stored using Vanish on demand.

Matt • September 14, 2009 6:37 AM

Some schools already have aging equipment that tends to lose student files. A disk-based solution is all you need for this problem.

neill • September 14, 2009 7:43 AM

just put an encryption key in your temp folder, and have it cleaned after reaching a maximum size (works like FIFO) – after a while the key is gone

Witold Baryluk • September 14, 2009 1:28 PM

@alexa search for tarsnap service.

Bob Gezelter • September 15, 2009 6:37 AM

Technology is often a double-edged sword. Vanish is by no means an exception.

In my July 31 article, “Vanishing Email and Electronically Stored Information: An E-Discovery Hazard” (available at: http://www.rlgsc.com/blog/ruminations/vanishing-electronic-data-ediscovery.html ), I noted two problems with Vanish: the problem of data leaving the system (e.g., cut/paste, screen scrape); and the problem posed by data self-destructing by inaction, rather than action. In particular, the problem arises with litigation holds, which are supposed to suspend normal document destruction policies.

There are many challenges when data storage is outsourced, some of which only become apparent when close attention is paid to possible needs and requirements involving the data, not all of which are always obvious on first impression.

100% guarantee • September 15, 2009 12:09 PM

keep your secrets on a notepad. You can burn that if you need to.

also…

I imagine that it would be much easier to fry a solid state drive, so its wiped forever, than an actual platter.

A little volt surge boxes attached to your solid state drive (would be neat) that will fry the internals to a crisp, with the flip of a switch or a command line command, with 0% fire danger. about the size of a 9v battery. interface it through pci(x). hmm. or how about drives that will do this to themselves (automatically) if the wrong password is used 5 or more times, or if it’s forcefully unplugged without the password allowing this procedure it will fry itself also. my guess is in the future, there will be products like this.

Clive Robinson • September 15, 2009 5:19 PM

“A little volt surge boxes attached to your solid state drive (would be neat) that will fry the internals to a crisp…”

Sorry, you will be extremely disappointed to know it won’t work.

All you are likely to do is melt the bonding wires and possibly blow the output drivers.

The rest of the chip will still be readable to somebody with the right equipment.

What you need is a little aluminium powder and iron oxide and an electric igniter, then you’re “cooking”. Alternatively, replace the iron oxide with PTFE to make “flare material” (but don’t breathe the fumes unless a death by emphysema is desired).

My personal choice though is a rewritable CD/DVD and a microwave oven (oh, again with special exhaust vents unless you want to die an unpleasant death).

Andy • September 20, 2009 8:46 AM

This really does look like a good way to share information with trusted parties.

I hope that this can become standard fare with all securely stored data.

Shazz • September 28, 2009 3:18 AM

Recently, some people showed that the current Vanish implementation has some security issues (basically that after 8 hours it is possible to reconstruct the message). Ok.

By the way, those students/researchers claim that their goal was:
“Unvanish shows that the Vanish system does not provide the privacy guarantees it claims, by making Vanish messages recoverable after they should have disappeared. Our goals with this work are to discourage people from relying on the privacy of a system that is not actually private.”

As Jason and others commented on this post, the entire key set is stored in a publicly accessible key/value store, but is that really the problem?

I’m not convinced that something “private” (my computer, a company, whatever) is definitively the solution, as that’s a material target… Your opinion?

Clive Robinson • September 28, 2009 5:19 AM

@ Shazz,

“Our goals with this work are to discourage people from relying on the privacy of a system that is not actually private.”

The problem with Vanish is it is trying to do what is actually impossible.

Their aim, put simply, is to have an encrypted plaintext publicly available and also the key publicly available so that anybody can read it.

Which means that the plaintext can be recorded by anybody who has access to the publicly available key.

Also, anybody with a little technical know-how can record the encrypted plaintext and the key in their own private file store.

The primary aim of the Vanish project is to have the key “age” and effectively become unavailable after a period of time.

But as I noted further up, they cannot delete it from backup tapes or any other semi-immutable storage (CD-R etc.) that it probably would have been recorded on automatically.

From this aspect it is very likely that the plaintext, ciphertext and key are going to be somewhere; the question then is what is the probability of getting at them after the “timeout” period…

aikimark • October 12, 2009 8:06 PM

Unvanish links:
http://z.cs.utexas.edu/users/osa/unvanish/

http://z.cs.utexas.edu/users/osa/unvanish/papers/vanish-broken.pdf

Scott Chapman • January 6, 2010 12:23 PM

This is not really a technology problem.

You can put things on the Internet that you can’t take back. It’s like saying words. You can’t take them back.

An ounce of prevention is worth a pound of cure. This requires discretion. Discretion is in short supply. I read Bruce’s essay, “Zero Tolerance” Really Means Zero Discretion. This also assumes that people have discretion. I’m not sure that’s a safe assumption (clearly, many don’t have it).

Zero Tolerance means Zero Ability to Learn Discretion. If a person is never allowed to “open their clue book” then how will they “get a clue”? There’s something wrong with this picture. So discretion has to be learned in another venue. This is an over-simplification. More than discretion is missing here, and that may not be the venue to LEARN discretion, etc.

Now that we have the Internet, people’s lack of discretion is more evident. You need to “look both ways before you cross the street” – think carefully before you post something on the ‘Net. Once it’s out there, consider it to be irrevocable public property.

I don’t think any form of technology will be able to solve the problem of a lack of discretion. I doubt technology will be able to adequately deal with the symptoms of the problem. It’s not a technical problem. It’s not a legal problem, either. Zero tolerance laws clearly don’t solve the problem. Other laws that attempt to solve these types of problems work about as well.

People need to use discretion in trying to solve problems. It may not be their problem to solve.

Tamarind • May 24, 2010 4:00 PM

Why has no one created a user friendly GUI application to securely delete user created files or slack space on Linux OSs…

testbeta • June 6, 2010 2:52 AM

what do you think of Eraser?
http://sourceforge.net/projects/eraser/
BCWipe isn’t free!

The SourceForge page for Eraser says:
“It completely removes sensitive data from your hard drive by overwriting it several times with carefully selected patterns.”
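
As a rough illustration of what “overwriting it several times” means, here is a minimal sketch. It is not Eraser’s or BCWipe’s method: it writes random bytes rather than carefully selected patterns, the function name `overwrite_and_delete` and the default of three passes are assumptions for the example, and it reads the whole file size into memory at once, so it only suits small files.

```python
import os
import tempfile


def overwrite_and_delete(path, passes=3):
    """Overwrite a file in place with random bytes, then remove it.

    Caveat: on SSDs and on journaling or copy-on-write filesystems the
    new bytes may land in different physical blocks, leaving the old
    data recoverable. Treat this as a sketch, not a guarantee.
    """
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            f.write(os.urandom(size))
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push this pass to the device
    os.remove(path)


# Demonstration on a throwaway temporary file.
fd, demo_path = tempfile.mkstemp()
os.write(fd, b"sensitive data")
os.close(fd)
overwrite_and_delete(demo_path)
```

Note that this only addresses a file on a disk you control. It does nothing about copies held by a remote service or its backups, which is the subject of the article.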

Taunter • July 6, 2010 9:02 AM

@Tamarind – Linux GUI… why does that just sound so wrong to me? Oh, yeah, I started using computers when ‘GUI’ meant 132 column printouts.

@Tom, waaay up there – No, you do NOT own the data, read the user agreement again. The moment you click on the ‘Post’ button, it’s theirs. So if you want it to remain yours, don’t. It’s that simple.

It amazes me how many crybabies there are in the younger generation about this issue. Meaning younger than 35, nowadays. It’s a computer, it does what you tell it to – not the other way around. Kudos @bob above, who took responsibility for himself and the time to read the fine print, and subsequently did not get screwed – unlike all of you whiners who will be continuously bent over for the rest of your lives. The end result of this is people wanting government to solve their personal mistakes, and thus is the primary reason we have out-of-control huge government invading every aspect of our lives. Is responsibility truly dead? If you don’t want some people knowing, then don’t post! It really is that simple!

BP • November 26, 2013 6:41 PM

Bruce,
If you’re too busy, get someone to update your wikipedia page to reflect your duties on the board at EFF. It’s not on there yet.

Rock1 • April 18, 2020 6:47 PM

It is always good to read and learn new information. Keep writing such good and informative articles so others can benefit from it.
