Schneier on Security
A blog covering security and security technology.
September 26, 2012
Using Agent-Based Simulations to Evaluate Security Systems
Kay Hamacher and Stefan Katzenbeisser, "Public Security: Simulations Need to Replace Conventional Wisdom," New Security Paradigms Workshop, 2011.
Abstract: Is more always better? Is conventional wisdom always the right guideline in the development of security policies that have large opportunity costs? Is the evaluation of security measures after their introduction the best way? In the past, these questions were frequently left unasked before the introduction of many public security measures. In this paper we put forward the new paradigm that agent-based simulations are an effective and most likely the only sustainable way for the evaluation of public security measures in a complex environment. As a case-study we provide a critical assessment of the power of Telecommunications Data Retention (TDR), which was introduced in most European countries, despite its huge impact on privacy. Up to now it is unknown whether TDR has any benefits in the identification of terrorist dark nets in the period before an attack. The results of our agent-based simulations suggest, contrary to conventional wisdom, that the current practice of acquiring more data may not necessarily yield higher identification rates.
Both the methodology and the conclusions are interesting.
Posted on September 26, 2012 at 7:11 AM
7 Comments
"contrary to conventional wisdom, that the current practice of acquiring more data may not necessarily yield higher identification rates."
Only, I suspect, if you define "conventional" as "political." I don't think it came as a surprise to anyone else.
@NobodySpecial - Agreed, although "political" tends to cancel out "wisdom." I guess enough people believe politicians to make "political" "conventional."
"...contrary to conventional wisdom, that the current practice of acquiring more data may not necessarily yield higher identification rates."
Amen to that. Now if only the NSA, which is now in the process of adopting higher-risk connections and devices, would relent and stop attempting to collect ALL OUR DATA at its Utah facility. In addition to having a hard time collecting that data (network impact), they are probably never going to use it all anyway.
Hey! That's not such a "new paradigm." It's in line with my years-ago grad school work.
Of course, my name is not Claude Shannon so no one would ever think of reviewing such things.
The paper is nice as a thought experiment.
The apparent strength of a numbers-based analysis with real-world statistical data is weaker than the author proposed, due to a few flaws.
1) Data about telephonic communications from 2005 is inherently different from current use of VOIP, SMS, and IP anonymizers.
2) There is no method provided to distinguish between a "cell" that wishes to cause harm vs. a "cell" that wishes to simply remain secret. (A surprise bachelor party vs. a public harm)
3) The privacy issue is tangential to data retention methods and analysis. But in this case the "return on investment" calculation includes invasion of privacy as a "cost". (More on this at the end of this overly long comment)
4) In an increasingly digital world, low-tech solutions are in danger of becoming extinct. The author never mentions the effectiveness of metadata analysis aided by human intelligence (HUMINT), psychological profiling, discrete-event effects, or forensics. In crypto terms, this could be like a brute-force attack vs. a dictionary attack.
5) Statistical node-behavior inputs may not be the best inputs for a discrete event such as an attack. From the standpoint of a communications network, the American football championship can look the same as a hurricane in Florida or mudslides in California. If the network analysis is not overlaid with geography and events, many random variables are introduced into the system. False positives or contextual biases can occur as well.
6) If I understood correctly, the author is saying that the amount of data being retained and monitored is not necessarily beneficial, and that these measures were implemented without thought to their effectiveness. I agree completely, even though the methodology used to prove it is quite flawed.
7 conclusion) In order to analyze social content, the inputs should come from socially generated data. For example: hit counts on a video-sharing site or "likes" on a public site carry far more publicly available, relevant data on trends than telecommunications metadata. Privacy is a non-issue: any measure to invade it can be circumvented, those countermeasures can in turn be cracked, and so on in a never-ending digital arms race. The policy of invading everyone's privacy in order to identify a handful of individuals is up for debate. Aren't we all potential threats to someone? What is the difference between "individualistic" and "subversive"? What is preventing those who wish to do harm from using this information for their own ends?
These are the questions that are not answered by agent based simulations.
End of line.
3 note) The issue of privacy is interesting because each person will determine the impact differently. Rock stars accept a certain lack of privacy owing to their success and may even use it as a marketing tool. Conversely, some people would be offended even if their email was made public.
In a large network of communication between individuals, the issue of privacy is more complex. It is easier for an individual offence to be overlooked in a larger data set. This is even more of a problem if this individual takes steps to hide the offence.
The retention of metadata on all of us can be beneficial for those who want to harm us, because there are too many ways to subvert the data stream. Using a "burner" cellphone, unprotected public resources (e.g., library and coffee-shop computers), and simple subversive techniques can hinder digital-footprint analysis.
This brings back memories. Many years ago I worked in the seismic survey industry and similar techniques were used in signal processing to extract sound energy reflections from rock layers.
The problem was that the reflected energy had surface-wave energy convolved into the signal. Homomorphic deconvolution was used to remove the surface wave, using a source recording of it: Fourier transforming the signals turned the convolution into a multiplication, taking the logarithm turned that into an addition, so subtracting the surface wave, then applying the anti-log and inverse transform, gave you a clean reflected signal.
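The core of that idea — convolution becomes multiplication under the Fourier transform, and logarithms turn the multiplication into an addition that can be subtracted — can be sketched in pure Python. This is a toy illustration with made-up signal values and a naive DFT, not real seismic-processing code:

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform (O(n^2)); fine for a toy example."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
            for j in range(n)]

def circular_convolve(a, b):
    """Circular convolution of two equal-length sequences."""
    n = len(a)
    return [sum(a[k] * b[(j - k) % n] for k in range(n)) for j in range(n)]

# Toy "reflection" signal and "surface wave" it gets convolved with.
signal = [1.0, 0.5, 0.0, -0.5, 0.0, 0.0, 0.0, 0.0]
wave   = [1.0, 0.8, 0.3, 0.0, 0.0, 0.0, 0.0, 0.0]
recorded = circular_convolve(signal, wave)

# Convolution theorem: DFT(recorded) == DFT(signal) * DFT(wave), term by term.
S, W, R = dft(signal), dft(wave), dft(recorded)
for s, w, r in zip(S, W, R):
    assert abs(s * w - r) < 1e-9

# Homomorphic step: log turns the multiplication into an addition,
# log(R) = log(S) + log(W), so subtracting log(W) and exponentiating
# recovers the clean signal's spectrum (inverse DFT would finish the job).
recovered = [cmath.exp(cmath.log(r) - cmath.log(w)) for r, w in zip(R, W)]
for s, rec in zip(S, recovered):
    assert abs(s - rec) < 1e-9
```

(Real processing would use an FFT and deal with noise, zeros in the spectrum, and phase unwrapping; none of that is handled here.)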
Noise was addressed by stacking multiple recordings: noise is random, so adding noise signals together makes it tend toward zero, while the signals (if aligned correctly) reinforce. Not sure if the noise in the homomorphic crypto is deterministic; if so, stacking won't work.
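The stacking effect is easy to demonstrate: averaging many aligned recordings shrinks zero-mean random noise while the repeated signal survives. A minimal pure-Python sketch with an invented signal (not actual survey processing):

```python
import random

random.seed(42)

# A repeating "reflection" signal buried in random noise.
signal = [0.0, 1.0, 2.0, 1.0, 0.0, -1.0, -2.0, -1.0]
n_records = 400  # number of aligned recordings to stack

def noisy_recording(sig, noise_amp=3.0):
    """One recording: the signal plus zero-mean random noise."""
    return [s + random.uniform(-noise_amp, noise_amp) for s in sig]

# Stack: sum the aligned recordings sample by sample, then average.
stack = [0.0] * len(signal)
for _ in range(n_records):
    rec = noisy_recording(signal)
    stack = [acc + r for acc, r in zip(stack, rec)]
stacked = [acc / n_records for acc in stack]

# The noise averages toward zero while the aligned signal reinforces:
single_error = max(abs(r - s) for r, s in zip(noisy_recording(signal), signal))
stacked_error = max(abs(st - s) for st, s in zip(stacked, signal))
print(single_error, stacked_error)  # stacked error is far smaller
```

Averaging N independent recordings cuts the noise amplitude by roughly a factor of sqrt(N), which is exactly why stacking fails if the "noise" is deterministic and identical in every recording.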
Amazing to see something I learnt as a signal-processing technique applied to cryptography.
The conclusion in 3 sentences:
1. Recent activities are more predictive than older activities.
2. They probably know that, but are saving older data anyway.
3. Why? Data mining for retrospective or selective prosecution/persecution.