Using Language Patterns to Identify Anonymous E-Mail
Interesting research. It only works when there’s a limited number of potential authors:
To test the accuracy of their technique, Fung and his colleagues examined the Enron Email Dataset, a collection which contains over 200,000 real-life emails from 158 employees of the Enron Corporation. Using a sample of 10 emails written by each of 10 subjects (100 emails in all), they were able to identify authorship with an accuracy of 80% to 90%.