Schneier on Security
A blog covering security and security technology.
March 5, 2010
Mariposa Botnet Shut Down
The Spanish police arrested three people in connection with the 13-million-computer Mariposa botnet.
Posted on March 5, 2010 at 6:02 AM
• 54 Comments
A single bot is worth around $15 (2007 dollars).
Jason Franklin, Vern Paxson, Stefan Savage, Adrian Perrig. "An Inquiry into the Nature and Causes of the Wealth of Internet Miscreants". ACM Conference on Computer and Communications Security (CCS), ACM, November, 2007.
A botnet of 13 million computers would be expected to be worth almost $20M. At a ROI of 2% this botnet should make the owners around $400k a year.
It seems the people behind it lacked the skill to market the botnet. Maybe they lacked even more skills as they were caught.
Another question would be what fraction of the bots would be running Windows?
100%, or more?
Sorry for over posting, but I forgot something.
"The first member of the gang was arrested in early February, when he inadvertently logged into the network without disguising the address of his computer. "
Security is hard, even for a botnet owner.
But Foxyproxy is not that hard to configure?
For this type of operation, I would use a special browser, hardwired into Tor.
I'd use an entire virtual machine on a thumb drive, encrypted with plausible deniability, running a Linux guest with twm, Tor, Privoxy and Firefox, with the user agent changed to that of one of the search engine bots.
Sorry if this is a dumb question, but what happens with a botnet once the people behind it are arrested? The computers won't be cleaned of the malicious code. Is it imaginable that someone takes control over it?
13 million PCs.
Majority of top 1000 Companies.
Infection via Net or Memory stick.
All by "hackers of less than average ability"...
Now imagine, if you can, what would be the case if the BotNet were covert, with near-infinite one-time-use command channels (I've described how to do this before).
If we look at it they actually made more than one mistake.
The first was their bot net was not covert. This was a business choice they made.
It was recognisable by,
1, its control channel.
2, its upload channel.
3, It was rented out for DoS etc.
They were caught not because of this or their money making techniques but by a simple avoidable error.
I don't know about you lot but I think we have a real problem here...
If you look at the two infection vectors it was set up as a "fire and forget" system not a "directed" system. Personally I think "directed" attacks have had their day; others might disagree (to which I would say look at the stats).
It was capable of crossing "air gap" security to infect machines on isolated networks.
What is not clear is if they worked out a way to get data back across the air gap (as the old saying goes "what goes up must come down"). Getting the data out is technically slightly easier than getting the infection in, in the first place. Which leaves getting the control channel across the air gap, again not an overly difficult technical task (the virus got in).
Now imagine such a system was run not by crooks but a Government.
They would have no need to have the Bot Net pay its way, so it could just sit there quietly gathering info on the PC concerned.
When the control channel sends an appropriate check to match (such as more than 30% of mailbox addresses being .gov or .mil), the bot sends back an ID string and the mail inbox address. This low-bandwidth covert upload channel can then be used to identify specific machines in a very covert way.
Thus it can be used to identify a single target machine and slurp out PDFs and Doc files as though the attack had been originally directed...
Not all money needs to be made by stealing ID or CC details etc. Sometimes copies of files are way way more valuable and the owner of the file may never know it went missing...
I'm waiting to see the rise of the covert "Info Broker" botnet.
The trouble with figures such as,
"A single bot is worth around $15 (2007 dollars)."
Is they have no real meaning.
Let us assume that there was a mythical company called DieBlow that made voting machines. And that they decided to use "signed code updates" installed from maintenance personnel machines (Wintel laptops).
What sized RSA key could be broken on 13 million PCs in, say, 12 hours a day of "slack time"?
What's the chance more than 50% of the maintenance personnel's laptops could be identified and subverted?
Final question: what price could you get to buy the presidency of the USoA?
Something tells me at least 50USD / machine or 650 million USD is not unrealistic for some political parties to raise from "spare change" of the offshore companies their supporters own...
400k divided between 5 people is 80k a year. This is a maximum. I'd be surprised if they made as much.
I love reading the detailed analysis of the bots.
"The trouble with figures such as,
"A single bot is worth around $15 (2007 dollars)."
Is they have no real meaning."
If you rent out a botnet, you will get a price per bot.
It is in the paper I mentioned. The authors of the study tried to determine these prices from real bids and came up with fluctuating prices, between ~$5-$20.
So, $15 might be an overestimate, and I should have written $200M for the total price and $4M per year return.
But even at a very low $1 per bot, the botnet would have been worth around $13M. And even a low 1% return would then make $130k per year.
Clearly, the Mariposa botnet did not get the owners that kind of return. Maybe the botnet was simply too big for this market to get a good return.
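For anyone who wants to check the arithmetic in this exchange, here is a minimal sketch. It assumes the simple model used above (sale price per bot times number of bots, then a flat yearly percentage return); the function name `botnet_value` is made up for illustration and nothing here comes from the cited paper.

```python
BOTS = 13_000_000  # botnet size reported in the post


def botnet_value(price_per_bot, annual_return):
    """Return (total sale value, expected yearly return) in dollars.

    Assumes a flat per-bot sale price and a flat yearly return rate,
    which is the simplification used in the comments, not a market model.
    """
    total = BOTS * price_per_bot
    return total, total * annual_return


# The two scenarios discussed: $15/bot at 2%, and a very low $1/bot at 1%.
for price, rate in [(15, 0.02), (1, 0.01)]:
    total, yearly = botnet_value(price, rate)
    print(f"${price}/bot at {rate:.0%}: value ${total:,.0f}, return ${yearly:,.0f}/year")
```

At $15 per bot this gives a total of $195M and about $3.9M a year, which matches the corrected "$200M and $4M" figures; at $1 per bot it gives $13M and $130k a year.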
Winter, rent is per unit time. If a bot rents for $1, over what period? (Typical bot rental prices are around $.01/day, +-50%.)
You can't have a "2% of value annual revenue" on something that lasts for 5 years.
I love reading detailed analysis of botnets too.
@clive, I love your posts here, and am glad you choose to only use your knowledge for the good :)
For 13 million bots, it would seem the efficiency of the total network was low. Thus the disillusioned expectation that more bots get more money. When the return approaches zero, we should see less of this but certain groups are only interested in the stolen data and access, not the money.
"Winter, rent is per unit time. If a bot rents for $1, over what period? (Typical bot rental prices are around $.01/day, +-50%.)"
Sorry, the prices quoted are sales. That is, price for transfer of control per bot.
Rent prices could be per run, e.g., one batch of spam, or for a time. But I have no idea whether these rent prices are known. I do not remember whether these were discussed in the study I refer to.
Therefore, I speculate about Return-on-Capital. For an asset worth $13-200M, you would expect a ROC as a percentage, that is, something in the range $100k-4M per year for a 1-2% ROC.
Either their market is simply not large or liquid enough to supply such a ROC. I think their market failed largely because of transaction costs (finding and convincing "honest" clients).
Return on capital for a wasting asset has to include depreciation. A bot worth $15 with a 5-year expected life can't rent for less than $3/year or it wouldn't be worth $15.
The prices I mentioned are available from a google search; confidential information I have access to agrees with that estimate.
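The depreciation argument can be sketched in a few lines. This assumes straight-line depreciation with no discounting, which is the simplest reading of the "$15 over 5 years means at least $3/year" point; the function names are hypothetical.

```python
def min_annual_rent(sale_price, life_years):
    """Lowest yearly rent that recovers the purchase price over the asset's life.

    Straight-line, no discounting: an assumption made for illustration.
    """
    return sale_price / life_years


def annual_rent_from_daily(daily_rate, days=365):
    """Convert a per-day rental price into a yearly figure."""
    return daily_rate * days


floor = min_annual_rent(15, 5)             # the $15 bot with a 5-year life
quoted = annual_rent_from_daily(0.01)      # the ~$.01/day rental price
print(f"floor ${floor:.2f}/year, quoted rate ${quoted:.2f}/year")
```

Under those assumptions the $.01/day figure works out to $3.65 a year, just above the $3/year floor, so the two numbers in this thread are at least consistent with each other.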
I think that you misunderstood what I was getting at.
The value of 13 million bots is way, way higher than the current market price. That is, even a one-off transaction could, if carried out the right way, net 0.5 billion USD at quite low risk not only to them but to the botnet as well.
I note from your reply to Seth,
"Either their market is simply not large or liquid enough to supply such a ROC. I think their market failed largely because of transaction costs (finding and convincing "honest" clients)."
That you are thinking similar thoughts.
If it was me I would be looking at having a "covert cloud" on top of the botnet doing highly parallel tasks that have a high dollar value either directly or indirectly.
For instance what is the value of the private key to sign code for protected devices (mobile phones / DRM etc)?
There are a whole host of other things like this that don't really put the bot net at risk or for that matter the operators.
Using such a resource for spam or DoS shows a real lack of ability in the thinking department for these people.
Oh and they don't have to find "honest clients" but likewise with a little thought the bot net could be laundered for certain types of "cloud" activity if they went about it the right way.
'More than 13 million PCs, in 190 countries in more than half of the world’s 1,000 largest companies and at least 40 big financial institutions.
The virus was programmed to steal all login details and record every key stroke on an infected computer.'
... if these were of less than average ability the more successful bot by the clever guys must still be in play...
And yet, just a few weeks ago, experts on this very forum claimed that taking out a zombie network in that fashion was impossible.
And yet it was done.
"Return on capital for a wasting asset has to include depreciation. A bot worth $15 with a 5-year expected life can't rent for less than $3/year or it wouldn't be worth $15."
"Oh and they don't have to find "honest clients" but likewise with a little thought the bot net could be laundered for certain types of "cloud" activity if they went about it the right way."
Obviously, we agree that these people were unable to capitalize on their botnet.
So this story only helps us to get more frightened about all the botnets that are herded by competent people.
"And yet, just a few weeks ago, experts on this very forum claimed that taking out a zombie network in that fashion was impossible."
Who said that the botnet was taken out?
@ Brandioch Conner
"And yet, just a few weeks ago, experts on this very forum claimed that taking out a zombie network in that fashion was impossible.
And yet it was done."
Err no it has not in any way been taken out. All that happened was the control channel was disrupted so the bots were not getting instructions.
According to one article, shortly after the control channel was disrupted one of the bot operators bribed a person working for an IP registry and got part of the control channel back, and then turned those bots onto one of the organisations as a DoS, taking out not just the company but the ISP and all the other companies that used the ISP.
Worse is Spanish law does not actually prohibit what has been done, so the gentlemen concerned are out and about and free to download the rest of the data they acquired and stashed away.
Realistically it looks like they might not even go to trial let alone see the inside of a jail. The Police are looking for evidence of a crime (such as Fraud) to actually charge them with and from what has been said it looks like that might not even happen.
Which leaves the point so delicately put by @ sebD,
"Sorry if this is a dumb question, but what happens with a botnet once the people behind it are arrested? The computers won't be cleaned of the malicious code. Is it imaginable that someone takes control over it?"
In this case the original bot net owners got some of it back.
However the rest of the net is still (in most cases) sitting there waiting for instructions via the control channel.
There is of course the question as to if the operators put a "deadman's switch" in.
As it was a "fire and forget" that had a vector to get across air gaps I doubt that it has a switch and if it has I again doubt it has a "take the world down with it" payload.
So the chances are some of the bots will sit there looking for the control channel until the "owned PC" joins the "great bit bucket in the sky".
"So this story only helps us to get more frightened about all the botnets that are herded by competent people."
Not just "herded" but "hidden" as well.
A while ago I gave the matter some thought, triggered by hearing some researchers self-importantly "blowing on" about how they were finding and measuring bot nets, and how these were therefore not really an issue as they were only used for Spam and DoS and we had mitigation for those.
I sat down and thought about how I would go about it and how I would protect the investment whilst capitalising on it.
Let's be honest, Spamming and DoSing are little more than babies scribbling on the paint work with a crayon. It's easily visible and easily removable/painted over.
I've got a little more technical knowledge and a slightly more inventive mind and let me put it this way,
Let's just say you don't want to know the potential of "covert bot nets" I thought up if you want to sleep at night.
And I'm sure there are people with better knowledge and imaginations than my old noggin has these days.
@pj Who says Clive only uses his knowledge for good?
"Who said that the botnet was taken out?"
That would be the articles that Bruce had linked to. They can be found in the story at the top of this page.
"Let's just say you don't want to know the potential of "covert bot nets" I thought up if you want to sleep at night."
Yes we do! :)
@winter "these people were unable to capitalize "
Two drink minimum.
@Brandioch: That article was a little over-simplified (and overoptimistic, to boot).
@ Clive - If you take the time to read the full report and news articles, you would see that you are quite wrong. The botmasters temporarily regained control of Mariposa - only temporarily. They are not now, in any way, in control of any part of the botnet. The Mariposa Working Group did not simply disrupt the communication - they took over the command and control structure of the botnet. Yes, the affected computers are still compromised with malware but the people responsible for Mariposa no longer have access via their command and control structure.
It makes me think that the US building one free, medium assurance OS would pay itself off. We mainly focus on securing our computers to protect our assets. One thing botnets and their strength in numbers bring to the forefront is that securing our systems may mean securing *their* systems too. Their means any vulnerable user with a decent connection. A Linux compatible OS might be a decent start, with kernel extensions to reduce attack surface, a small hypervisor that just maintains kernel integrity (i.e. SecVisor), and a saved, signed trusted state for particular applications in read-only medium. There would be a few certified hardware platforms: desktop; workstation; server; notebook; netbook. I think this would be a few million well spent reducing our attack surface, esp. if open-sourced, analyzed and distributed by foreign sources so foreign users don't have to totally trust USA.
"If you take the time to read the full report and news articles, you would see that you are quite wrong."
First do yourself the favour of going back and reading what I wrote will you?
But your use of "quite wrong" suggests you are on a PR spin exercise for the MWG so let's go through it.
"The botmasters temporarily regained control of Mariposa - only temporarily."
'Err no it has not in any way been taken out. All that happened was the control channel was disrupted so the bots were not getting instructions.
According to one article, shortly after the control channel was disrupted one of the bot operators bribed a person working for an IP registry and got part of the control channel back, and then turned those bots onto one of the organisations as a DoS, taking out not just the company but the ISP and all the other companies that used the ISP.'
What part of this is untrue?
Please note, the articles were not clear just how "temporarily" it may or may not be (note deliberate use of present tense).
And they made no comment as to what else the Bot net Operator did with those machines during that time, such as maybe putting more malware onto some of those machines as a seed to a new botnet (thus use of present tense).
Hence unlike you looking through rose tinted PR spectacles, I have cautiously left it open.
"They are not now, in any way, in control of any part of the botnet."
Sorry, that is not a claim you can justifiably make. There are approximately 13 million machines out there; how can you know what their state is at any one point in time? Some were out of the MWG's control for a while, and the wake-up call was that part of the MWG's ISP got DoSed. The botnet operators or others currently unknown (there may be more than three, and probably more than three who know how the control channel works) may well have taken over some of them briefly to put on new malware. You don't know they have not, and you cannot watch all 13 million (or however many are left), so your statement is of little value and sets a possibly false impression that may be more harmful.
"The Mariposa Working Group did not simply disrupt the communication - they took over the command and control structure of the botnet."
So if your statement is true you are confirming that others know how the channel works and thus others can work it out and do the same.
Thus you are only "tentatively" in control of it until somebody who knows how it works goes and bribes somebody else, etc, etc....
You lost control of part of it once; unless you take other corrective steps it can happen again, such is the nature of the way things work.
Or would you care to give reliable verifiable technical reasons as to why it's not going to happen again?
Oh but hang on, you partially confirm what I've said with,
"Yes, the affected computers are still compromised with malware"
And go on with,
"but the people responsible for Mariposa no longer have access via their command and control structure."
That is pure supposition on your behalf, the MWG lost part of the botnet once and in all probability it can be done again ad nauseam. Don't repeat the mistake, just for a bit of PR Spin.
Because you fail to mention your hold on the command and control structure is at best "tentative" for a whole host of reasons not just "bribery".
So put your PR spinner back in the box please.
Giving the FALSE IMPRESSION all is well when in all probability it is not, is not doing the MWG or anybody else any favours. In fact it gives many the excuse to do nothing which is much much worse.
The fact is a lot (figure currently not known) of the machines of this bot net are still sitting there waiting for commands and they will do so in many cases until either proactive action by the machine's owner or they get taken off of the Internet.
The exception to this is if there is some way the control channel can be used to modify the malware so it becomes ineffectual in some way.
The only problem with this is that in a number of jurisdictions (the UK being one) you would be committing a crime if you did so. So Catch 22, damned if you do damned if you don't...
Thus all anybody (who knows how the control channel works) has to do is get between the MWG faux controller and the infected machine and they can, by using a number of well known and less well known tricks, get control of those downstream machines. Thus the closer to the MWG faux controller they do it the larger the number they get.
Would you or somebody else from the MWG care to comment honestly?
Complacency and Spin will only encourage those who should be taking action to put their house in order to carry on sitting on their thumbs...
@ Nick P,
Careful what you wish for ;)
I have in the past noted that you would be safe using a non mutable (ie CD based) OS and no semi-mutable memory. Thus power down and any problems go away.
Well for various reasons we have discussed I was "over optimistic" on the degree of "safe" and should have said "possibly safe If, you also..."
The one thing this bot net did was use a very old attack vector of virus style infection from removable media....
Thus it could cross the "air gap" from your Internet PC to your private PC and get at your confidential information...
Thus it only required two other things,
1, Getting control information across the air gap.
2, Getting the confidential information back across the air gap.
Humans are creatures of habit so if you cross the air gap once the chances are you can do it twice etc, so getting control data in is almost easier than getting the malware in.
Oh and if you think about it as long as the removable media is infected the "Internet" PC will just get reinfected the minute the media gets plugged in again...
Thus the hard part is hiding the confidential data going back to the Internet either on the removable media or the low(ish) bandwidth comms link.
Sadly not all (read most) users can be educated to this fact at a level where it modifies their behaviour.
But there is the issue of AV and zero day...
Malware writers are getting smart faster than almost everybody else.
If they first stop and think how to capitalise their malware properly (ie not Spam or DoS) and make the bot net properly covert then we are potentially going to "be living in" some seriously "interesting times"...
I just wish a few people would wake up to the fact and start thinking seriously about it instead of blowing their trumpets and resting on their laurels.
@Nick P: someone is working on it, just not in the US (and the goal is not just medium assurance).
> Who says Clive only uses his knowledge for good?
Who says Clive isn't totally talking out of his ass?
I'd like more specifics. Open-source people have been working on this for decades, but have produced jack. There are already high assurance proprietary solutions that I'd trust with my life, but not against backdoors. ;) If you want to see them, then retrieve an older post where I link to all of them by typing this in Google: "schneier" "Nick P" "Integrity" "Turaya". Maybe adding "MILS" would help, but I have a ton of links and mentions on a few pages of real things. I'm aware of many open-source foreign projects, including those that succeeded like seL4, Nova, Nizza & Turaya. We'd all be better off if you mentioned the name of your sources, as we could evaluate their prospects. I always do.
I sometimes wonder. He's undeniably very knowledgeable and brilliant. Much of what he says in esoteric fields checks out from my own diverse 10+ year background. Determining the proportion of Clive's comments that are wild ass guesses is an NP-complete problem. ;)
Yes, any information flow between the systems can be a problem. The best we can do is only allow certain types of data and filter them. Thanks to the military, there are tons of these "cross-domain" security solutions that have proven quite effective. But, will these things work out to protect ignorant users against sophisticated malware developers? I have doubts. My solution to this problem in the past was multi-pronged: limited file formats; verification that it is in fact that format; trusted or very isolated, execute-only viewers, esp. for PDF & media. I couldn't solve proprietary format issues like Word, so I just said to avoid them for others. Besides, RTF is so much more space-efficient & easier to parse. ;)
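To make the "limited file formats; verification that it is in fact that format" idea concrete, here is a minimal sketch. It assumes an allow-list of three formats and checks only the leading magic bytes; the names `MAGIC` and `claimed_format_matches` are made up for illustration. A matching header does not prove the rest of the file is well formed, so this would only be a first gate in front of a proper parser or an isolated viewer.

```python
import os

# Allow-list of extensions and the byte signature each file must start with.
# The choice of formats here is an assumption for the example.
MAGIC = {
    ".pdf": b"%PDF-",
    ".rtf": b"{\\rtf",
    ".png": b"\x89PNG\r\n\x1a\n",
}


def claimed_format_matches(filename: str, data: bytes) -> bool:
    """Return True only if the extension is allowed and the header agrees with it."""
    ext = os.path.splitext(filename)[1].lower()
    magic = MAGIC.get(ext)
    if magic is None:
        return False  # format is not on the allow-list, reject outright
    return data.startswith(magic)


print(claimed_format_matches("report.pdf", b"%PDF-1.4\n"))  # header agrees with extension
print(claimed_format_matches("report.pdf", b"MZ\x90\x00"))  # executable header behind a .pdf name
```

Anything not on the list, Word documents included, is simply refused, which mirrors the "avoid them" approach for proprietary formats.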
There is a bigger issue, though. You should have noticed it first when thinking of booting from read-only media. When doing this, one implicitly trusts the platform booting it. Malware authors have successfully attacked each element of this platform: processor bugs; firmware (general); BIOS; cache; PCI/IO subsystems. This is before one even considers the kernel or drivers running in kernel-mode! The attack surface is f***ing huge! Any attempt to make a mass market, medium assurance system will need to solve these problems. The solution will also need to be somewhat seamless. Here's a partial I've been toying with: Intel Core Due Extreme (least bugs); quality firmware w/ tweaks to deal with processor bugs; TPM. The Perseus Architecture (see Turaya) is used. The firmware loads a trusted microkernel, services, & drivers first. They use TPM to unlock keys used to load/verify a Linux/Windows VM. Critical signed updates could be hot-loaded, but the main form of update would be getting a new RO memory medium. As in MILS, security-critical services are outside & information flow is very explicit.
It's not high assurance, though, as the kernel & trusted functions aren't as verified & covert channel reduction not as strong. More than good enough as a seamless security improvement for the average user. Could eliminate read-only media with a PC made to work with high assurance kernels, like INTEGRITY on Dell Optiplex. Then, the software would be trustworthy enough to ensure execute- and read-only file operations. The medium assurance version, though, is a nice stepping stone & wouldn't require a serious hardware conversion for the better medium assurance version that features high assurance components.
@ Nick P,
"Determining the proportion of Clive's comments that are wild ass guesses is an NP-complete problem."
Hmm not sure myself, but rumor has it I am human so I must be fallible 8)
Seriously though, even what appears "wild ass" is usually verifiable by simple logical steps that are usually well within others' comprehension. It's just that they don't see the connection until you walk them through, then you get that "well that's obvious" look.
With regards the issues,
"Yes, any information flow between the systems can be a problem. The best we can do is only allow certain types of data and filter them."
What it boils down to in the end is not allowing external data to become code
Easy to say but...
"Thanks to the military, there are tons of these "cross-domain" security solutions that have proven quite effective."
Yes and in general they have a large chunk of "Not User Friendly" attached to them.
"But, will these things work out to protect ignorant users against sophisticated malware developers? I have doubts."
I'm not sure that there are actually that many "ignorant users" out there. The problem is better looked at as one of the "utility of resources". That is, a user regards a computer in the same way they do their car or mobile phone: they have a function to make their life easier, such that they can get better utility out of their time.
If you think about it even your averagely smart malware writer is well, well ahead of the game. And almost the entire security industry is having to fight management to even have a voice, let alone resources.
Mainly because management's view is not one of security but "liability" limitation/mitigation/externalisation, which means "safe doors on tents" because SarbOx / PCI / HIPAA et al are effectively prescriptive not advisory, mainly so auditors can have tick box checklists.
And to be honest I'm not sure of a way in the current climate of how you would jump management out of the view. Especially with "self regulation" and those auditing dependent directly for their income on those they audit.
Most security products get designed to tick the boxes on the auditor's checklist, irrespective of whether it makes sense or not (BF Skinner's example of two passwords = two factor authentication... being a prime example).
Any way there are still some people out there that realise the difference between faux checklist security and actual security and are actually prepared to devote a few resources to it.
"My solution to this problem in the past was multi-pronged:"
Always a good way, especially if you get 100% or more overlap from area to area.
"limited file formats; verification that it is in fact that format;"
Ouch, not easy at the best of times and with the likes of XML it is a hair's breadth from being interpreted code.
"trusted or very isolated, execute-only viewers, esp. for PDF & media."
That is almost game over as much media these days is effectively interpreted code not data.
"I couldn't solve proprietary format issues like Word, so I just said to avoid them for others."
Yes most MS files are allowed to contain either code or linked/hidden data that can be used as code by something else (ie IE components).
"Besides, RTF is so much more space-efficient & easier to parse. ;)"
Hey how about WordStar 4 8)
"There is a bigger issue, though."
Sadly there are many, not all of them obvious even with 20-20 hindsight.
"You should have noticed it first when thinking of booting from read-only media. When doing this, one implicitly trusts the platform booting it."
Put simply "anything semi-mutable" and it's potentially game over.
"Malware authors have successfully attacked each element of this platform: processor bugs; firmware (general); BIOS; cache; PCI/IO subsystems."
Yup and it's going to get worse with time with "PnP interfaces" etc. DMA from I/O is just plain nasty.
"This is before one even considers the kernel or drivers running in kernel-mode! The attack surface is f***ing huge! Any attempt to make a mass market, medium assurance system will need to solve these problems."
My viewpoint is actually go for much finer granularity than the kernel. That is think parallel processing and scripting, with functions running in jails with signatures on not just the function but its IO as well (my Prison -v- Castle view).
"The solution will also need to be somewhat seamless."
That's the big problem irrespective of which way you go, it's almost always going to end up a "start from scratch" approach.
"Here's a partial I've been toying with: Intel Core Due Extreme (least bugs); quality firmware w/ tweaks to deal with processor bugs; TPM. The Perseus Architecture (see Turaya) is used."
TPM has its issues as I'm sure you are aware (and no I'm not talking the DRM debate for others tagging along on this).
"The firmware loads a trusted microkernel, services, & drivers first."
Bang, how does the firmware know it's not running in a virtual machine?...
(Hence my fondness for talking about "limited resource jails" and hardware hypervisor checking "resource signatures" (such as time etc)).
"They use TPM to unlock keys used to load/verify a Linux/Windows VM."
Ok as long as the keys have useful meaning (in some systems they actually don't, and in others the keys are weak for various reasons). It is very very difficult to get 90% right and is very viewpoint and assumption dependent.
"Critical signed updates could be hot-loaded,"
As you know I'm no real fan of "signed code", it actually means very very little in practice. Essentially all it means is "Fred Smith has locked a binary file" on such and such a date etc. It in no way says anything about the quality of the binary file.
"but the main form of update would be getting a new RO memory medium."
It has the same problem.
I'd like to see "execution signatures" as well whereby any code can be externally monitored and checked whilst it is executing. But this means significantly reducing complexity to make the signatures more easily checked.
But I know this is not going to happen this side of Xmas 2020 8(
"As in MILS, security-critical services are outside & information flow is very explicit."
Thus stepping towards 2020 ;)
"It's not high assurance, though, as the kernel & trusted functions aren't as verified & covert channel reduction not as strong."
It's actually about as far as you can go on a "single system" without additional external monitoring.
"More than good enough as a seamless, security improvement for average user."
"Could eliminate read-only media with a PC made to work with high assurance kernels, like INTEGRITY on Dell Optiplex. Then, the software would be trustworthy enough to ensure execute- and read-only file operations."
The hardware is still the Achilles heel though and DMA is a no no as is direct IO of any form.
"The medium assurance version, though, is a nice stepping stone"
It would cut a lot of the problems we currently see (known knowns) and quite a few we can foresee (known unknowns). However I suspect malware will "evolve" if such a system gains traction.
It's a bit like the pointless "Linux has less malware than MS" argument. The reason is more to do with market share than technical superiority. You shine a powerful enough light on a steel plate and eventually it will get through, it's just a matter of time, determination and incentivisation.
And of course there is always going to be that user to do a handy end run around the technical security...
Which is where we started.
So step 1 on the master plan,
1, Eliminate all users...
Ah, I like the response. At first, I was thinking: "he had to respond to everything point by point..." but then I realized it might be easier for other readers to follow than your usual essays. That is, the organization scheme and context are readily available. Ok, so my response.
"not allowing external data to become code"
Don't think it's that simple, although that's a nice start. I think the rule is more abstract: not allowing external data to cause unauthorized behavior. This covers cases where the intentionally malformed data causes unwanted effects, although the system never classifies it as code because it doesn't actually execute. It just causes side effects.
On "Not User Friendly" cross-domain and user ignorance...
We are on the same page on Not User Friendly: that's why they'll usually fail except in certain careful corporate implementations. As far as ignorant users, I assure you they exist and en masse. When I got laid off from one security-related job, I ended up on self-checkout at a retail store. The system was ingeniously designed from a usability perspective: one basically didn't have to think to operate it. However, there was always a steady & very significant percentage of people who would do incredibly stupid things. These things were so stupid the other customers laughed when told about them. Even when told the problem, many would continue to try the same thing. So, I think the people and situations you described are definitely correct, but we can't rule out people doing dumb things regularly. If the percentage I saw is indicative of national techno-dumbness, then there would be over ten million people whose dumb decisions would destroy a mass-market cross-domain solution.
on format "game over... much media... is interpreted code not data"
On low-level hardware issues...
Yes, Plug-n-Play has become more than an interface: it's a philosophy solely rooted in usability. That's nice until security is concerned. I've often joked it should be called Plug-n-PRAY. ;) We still haven't found common ground on the castle vs. prison view, as you call it. All this signature tracing, special IO formats, parallel... I don't see regular code monkeys doing it right or even caring. However, we have built castles that lasted, like Gemini Network Processor for confidentiality/integrity or IBM mainframes for availability and a certain degree of isolation. We even have a platform that verified the processor (AAMP7G), firmware, kernel (INTEGRITY-178B), and trusted components. Like Gemini, they took a layered and secure composition approach that's proven itself over time.
The most recent addition to castle model is MIT's Aegis secure processor. It doesn't trust the RAM, IO, etc. It's a lot like TPM in the use of root of trust & secure load & attestation. However, the loading and trust starts inside the processor itself and it keeps hashes of external RAM and IO stuff, checking it regularly. It can also verify signed updates to critical portions. It has onboard cryptographic acceleration and does all of its trusted comp. functionality with just a few extra instructions. If one wants to build the ultimate next-gen castle hardware, they find a better CPU. I'd like a similarly trusted IOMMU, though, with authenticated communication with the CPU. I can't imagine all of this having a high rate of speed, though. Maybe start with embedded stuff.
Additionally, we could combine my Castle and embedded server board ideas with your prison view to use a bunch of small, embedded Aegis-style chips running in parallel. The totality of these, along with a Master node, become the system. One node would be the GUI or user-facing portion of each app, with the others acting as a compute node with trusted boot, load, and (when done with app) purge. Servers would be the first useful application, but one could put a bunch in a desktop tower with KVM for a secure PC. ;)
"how does the firmware know it's not running in a virtual machine?"
Similar to the Chinese room argument or those that say it's impossible to tell reality from a granular-enough simulation if reality is simulatable. Hardware hypervisors can be probed for keys and then simulated. At some point, we have to trust something. If we wanted a medium assurance, it would start with the CPU itself in some hardwired bootloader. It loads and verifies the initial firmware. Then, the firmware loads any signed updates, using the hardware verification function. Once firmware is loaded up, then the OS can be loaded similarly or with a regular TPM. The processor is the root of trust. It must work. The TPM just builds on it. TPM is useful because there are several inexpensive chips on the market, along with plenty of software support even for high assurance vendors. Again, it's just a building block, not the root of trust.
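The load-and-verify chain described above can be sketched in a few lines. This is only a toy illustration under stated assumptions: plain SHA-256 digests stand in for real signature checks, and the names `Stage`, `measure`, `build_chain` and `boot` are hypothetical, not any vendor's API. The one value the "processor" has to hold is the root digest; every later stage is checked by the stage before it.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One link in the boot chain (hypothetical structure for illustration)."""
    name: str
    image: bytes
    next_digest: str  # digest the NEXT stage must match; "" for the last stage


def measure(stage: Stage) -> str:
    # Cover both the image and the embedded expectation for the next stage,
    # so neither can be swapped without changing this stage's measurement.
    return hashlib.sha256(stage.image + stage.next_digest.encode()).hexdigest()


def build_chain(images):
    """images: list of (name, bytes) in boot order. Returns (root_digest, stages)."""
    stages = []
    next_digest = ""
    for name, image in reversed(images):
        stage = Stage(name, image, next_digest)
        next_digest = measure(stage)
        stages.append(stage)
    stages.reverse()
    return next_digest, stages


def boot(root_digest: str, stages) -> list:
    """Load stages in order; refuse to continue at the first mismatch."""
    expected = root_digest  # the only value the hardwired root has to trust
    loaded = []
    for stage in stages:
        if measure(stage) != expected:
            raise RuntimeError(f"{stage.name} failed verification; halting boot")
        loaded.append(stage.name)
        expected = stage.next_digest
    return loaded
```

Swapping any image after the root digest is fixed makes `boot` raise before that stage is loaded, which is the whole point of starting the trust in the processor rather than in the TPM.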
"DMA is a no no as is direct IO of any form"
Hence the IO MMU. There is one in the vPro chipset, which that Dell Optiplex has. It's MMU+IOMMU+TPM+IntelVT+securekernel = much_improved_software_security. The only worry there is implementation flaws, which vPro had from the get-go. IOMMUs restrict DMA device access to certain memory regions. Trusted firmware, a secure kernel, a medium assurance IOMMU, and a carefully orchestrated protocol for granting/revoking access to memory regions solve this problem if we use the castle approach. Device firmware that has access can be subverted at some point, but the MILS approach only gives it minimal access to begin with. A random PCI card's subverted controller doesn't read the RNG's internal entropy state. It's either an IOMMU or a custom, slower, non-x86 CPU with baked-in security. Which will users pay for?
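As a rough picture of the grant/revoke protocol, here is a toy model of the check an IOMMU makes on each DMA request. It is a sketch under assumptions: the class `Iommu`, its method names and the device labels are made up for illustration, and real hardware works on page tables rather than Python lists.

```python
class Iommu:
    """Toy model: each device may only touch regions explicitly granted to it."""

    def __init__(self):
        self._grants = {}  # device name -> list of (start, end) half-open ranges

    def grant(self, device: str, start: int, length: int) -> None:
        self._grants.setdefault(device, []).append((start, start + length))

    def revoke(self, device: str) -> None:
        # Revocation drops every region the device held.
        self._grants.pop(device, None)

    def check_dma(self, device: str, addr: int, length: int) -> bool:
        """True only if the whole transfer lies inside one granted region."""
        if length <= 0:
            return False
        end = addr + length
        return any(start <= addr and end <= stop
                   for start, stop in self._grants.get(device, []))
```

A device with no grant, such as the subverted PCI card in the example above, gets `False` for every address, and a granted device is refused as soon as a transfer runs past the end of its region.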
"It's a bit like the pointless "Linux has less malware than MS" argument."
I don't think so. We're talking about the difference between low-defect development approaches of highly modular, predictable software and standard software development approaches of mixed-up, complex software. It's "correct by construction" vs. "penetrate and patch." Penetrate & patch gives quick time to market, plenty of features, maybe low cost, and a ton of bugs. Correct by construction gives only a core feature set, slower time to market, a cost increase, and few to no bugs. The Linux vs. Microsoft argument is about incentives to attack two pieces of software that are considered similarly buggy. I mean, sure, more attackers and incentives will result in more existing bugs being found in low-defect software. The point is that medium assurance, low defect software has far fewer latent bugs and, as experience shows, most of them have minimal (or unnoticeable) impact. Let attackers move from hitting Linux platforms to BAE Systems' XTS-400 platforms: they aren't going to get far trying to subvert XTS-400's TCB with infected Word documents or automated worms. They'll stop at user-mode data consumers and integrity mechanisms might even prevent that from going far. Like med. assurance software in general, its architecture, development methods, testing and maintenance are just that much better.
"So, step 1 on the master plan: eliminate all users."
If I said I was with you on that, it would be considered conspiracy to commit genocide. So, I'll say it's a great idea that I [cough] would never want to see happen [cough]. We can fantasize, though, by popping in Terminator 3 and watching Skynet solve all humanity's security problems. (Well, it creates a certain 'availability' problem for the human system ;) The Matrix puts an interesting spin on it: from a certain perspective, the machines are the users/consumers and the people are the producers. We move from the top of the food chain to the very bottom... in one script. ;)
@Clive - Hi. To be clear, I am in no way associated with the MWG, I am simply a curious (if somewhat less knowledgeable) bystander. I think you raise a lot of valid points, I was simply trying to clarify what I thought was an error, but which may have just been a misunderstanding on my part. I thought you were under the impression that the MWG had only briefly taken over the c&c structure.. re-reading your comments now, I see that this isn't what you were saying. Yes, the botnet could potentially be regained by someone, somewhere, you are correct. And yes, the machines (as has been reported elsewhere) were infected with additional malware that the MWG does not necessarily have a view of and that could lead to misuse in other manners. This is my first time posting here and I didn't intend to start things off on the wrong foot - my apologies if I (as usual) managed to stick that foot right in my mouth.
@ Nick P,
"Additionally, we could combine my Castle and embedded server board ideas with your prison view to use a bunch of small, embedded Aegis-style chips running in parallel. The totality of these, along with a Master node, become the system."
The hardware sounds like it's as close as it's going to get without going fully custom (not that that is too much of an issue these days).
From the OS etc side if you say your "castle" is the "Prison Gov & Warders" then you are 99% of the way there, but coming from the top down not the bottom up.
The bottom up approach adds some interesting twists that improve on what the top down approach will give you.
And oddly with the bottom up approach it's all mostly for free because of engineering in the "sweet spots".
First a little boring history as to how I arrived at the basic design.
The idea all started with a thought almost identical to yours,
"I don't see regular code monkeys doing it right or even caring."
I'd like to say it all started at a security event for programmers a few years back, however the idea had been bouncing around my head one way or another for quite some time before that (actually nearly 20 years ago when I was thinking about doing a PhD, but could not find "Readers" to act as supervisors because they didn't get the problem...)
Any way back to the security seminar, which was little more than an informal side event at another larger event.
The bod out front asked how many years of production application programming time people had had, and it varied but averaged around 12 years across something like 50 people. So something like 600 years of nose to the grindstone experience give or take a little.
The bod out front then asked how many had had formal security training. Let's just say the total was around 2 days amongst the 50 or so production programmers there...
So ~2 days in 600 years of production code cutting, or ~1 day in 100,000 days (most progs work more than 7 days a week equivalent even if not paid for it)... And the programmers at this walk-in session were there because they had a personal incentive to be there. I dread to think what the rest of the industry average was like.
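For anyone wanting to check the back-of-envelope figure, the sums work out as below. The only assumption added here is counting 365 working days per year, which follows the "more than 7 days a week equivalent" remark; the other numbers are the rough ones quoted above.

```python
# Rough figures from the seminar head count
programmers = 50
average_years_each = 12
training_days_total = 2

person_years = programmers * average_years_each        # 600 years of experience
working_days = person_years * 365                      # assumes every day is a working day
days_per_training_day = working_days / training_days_total

print(person_years, working_days, days_per_training_day)
```

That gives one day of security training per roughly 110,000 days of code cutting, so "~1 day in 100,000" is about right.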
Yes it was starting to change more recently but then, well first some major companies and then the banks gave the world the one finger salute and in the resulting economic climate it has regressed more than ten years in many respects in about the length of time it takes to blink twice (or at least that's what it feels like).
Thus you can probably sympathise with my view that it is a waste of time trying to resolve the security problem any time soon by education of "regular code (cutting) monkeys". Irrespective of whether they want to make the effort, management etc are just not going to let it happen, 99 times in 100 the current management view is "liability mitigation" not "security" (which means the money goes to auditors).
I'll admit I'm perhaps a little jaundiced but on the "computing side" I come from a CLI and Embedded background. Unlike most code cutters I'd had to cut my own BIOS / OS etc as well as the application on top and get it out the door with "no patches possible" (yup that's the way it used to be ;). We had to be engineers not artisans or artists.
Also due to when I entered the industry I'd designed the actual logic circuit guts of what we blithely call CPUs, and cut the state machines and microcode to bring them to life from a handful of 74Sxx chips (anyone apart from me or NASA remember the 181?) or ECL.
Basically I've lived with abstracting design up the stack and seen how the faults slide in proportional to the abstraction and now get left to "patch Tuesday" etc (and let's say I'm less than impressed with it).
You may know the current 10/20/30/40 rule of "application code error causes",
10% = Business logic errors.
20% = User input errors.
30% = Data timing errors from back end systems etc.
40% = Unknown "glitches".
The industry clearly has significant growing pains from the Terminal/CLI days. Only 30% of the errors are down to "human issues", 70% are down to code cutting issues. Much of which is due to code cutters not understanding "state" and "timing", or they and others not knowing how to "Stress-n-Test" for them (regression testing only catches "Known Knowns" and a few "Known Unknowns", as for probabilistic testing or "fuzzing" that's what the users do better...).
If you reflect on this sad state of the industry and ignore the needless bells and whistles of the GUI etc for a minute you will see in many ways we are a lot lot worse off than the old "Unix Scripting Methodology" days. Mainly because we have very much got component re-use wrong, not just from the security aspect, but in general.
However also reflect a bit on the Unix method and you will realise it is about as far as we've got in realistic "security separated" code re-use.
You have three parts, the OS provides the plumbing and the CL utilities provide the base functions, and the script provides the overall program logic.
You see a simple but elegant split of duties with it. You have a script telling the OS to line up functions and the required plumbing to push the data through to achieve good solid results.
So good in fact that some systems that used the method for "Rapid Application Design" found they did not actually need to code much up (and in the case of many SysAdmins they had better things to do with their time).
In fact a new generation of Unix scripters came along with early web Common Gateway Interface (CGI) scripts for early web design. And guess what we are going to see similar pop up again with Cloud Computing.
Scripting this way is interesting because of the separation, and how it also allows parallel threads to run etc.
Importantly though the OS takes care of a big chunk of "State" and "timing" which the script writer usually does not see or care about as it just happens. The (apparent) limitations of the CLI functions force the script writer to remove (needless) complexity and (usually) deal with exceptions. Oh and it is usually fairly clear at what point in the script things break thus making testing easier. All of which are good ;)
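To make the three-way split concrete, here is a small sketch in which Python generators stand in for the CL utilities and a `pipeline` helper stands in for the OS plumbing. The function names are illustrative stand-ins only, not the real `grep` or `tr`, and in a real system the plumbing would of course be pipes between separate processes.

```python
from functools import partial


# Small generic filters: data in, data out, no knowledge of the overall program.
def grep(pattern, lines):
    return (line for line in lines if pattern in line)


def tr_upper(lines):
    return (line.upper() for line in lines)


# The "plumbing": connects each stage's output to the next stage's input.
def pipeline(source, *stages):
    stream = iter(source)
    for stage in stages:
        stream = stage(stream)
    return list(stream)


# The "script": pure program logic, it only says which filters to line up.
def report_errors(log_lines):
    return pipeline(log_lines, partial(grep, "error"), tr_upper)
```

Note the script never touches the data directly. If the plumbing layer enforces who may read what, the filters and the script need no security code of their own.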
So what about "security",
if you think about it as long as the OS does the plumbing and data access securely then the script writer does not need to know about the low level security only the high level security. Thus a big chunk of the problem is taken off of them and they can get on with the business logic etc (sadly Unix security sucks by current needs and it needs addressing).
Thus you have small generic functions (ls / tr / grep etc) that act like filters. They have no need of security in general because of the way they work. That is the OS hands them data they filter it and hand the result back to the OS.
It is the OS that actually does the security via file permissions etc.
However there are issues with this. The first is the Unix security model is way way too granular and way way too limited in scope (though some tools etc have improved this quite a bit). The second is the functions have become bloated and overly complex for what they do 99% of the time and that is bad for security.
It also gives rise to huge globs of needless code which makes things oh so inefficient.
So sadly the code cutters won the day by weight of numbers and other scripting methods with less separation etc were brought in. And some have been very very successful but done badly.
We see this problem with the proliferation of code libs that are available (have a look at the Perl CPAN resource or the Python equivalent to see what can happen with scripting libs).
Unfortunately we have lost the separation and thus the security and other desirable aspects in most cases :(
Now I have also had cause to think a little on Turing's and Goedel's little problems (halting and undecidability etc). And realised yes you could do an end run around them...
The problems arise mainly because it is a "single" universal engine "observing itself" effectively "from within", and there is no formal separation of code and data.
So I thought OK let's keep the universal engine with all its faults and many many advantages BUT we would add a second non universal engine to keep an eye on it.
How does that change the equation? Well actually in a limited domain such as a security hypervisor it changes it a great deal (it does not 100% solve it but that is another issue).
For security this second engine does not have to be universal at all (a simple "all states known" state machine will do). And you can make it in such a way that it always fails safe. You can remove "general looping and branching" and in many ways make it very like a DSP system that processes data but does not act on the data except when a limit is reached, in which case it fails to safe (ie halt in this case). Oh and the data it processes is not the script data but the signatures that arise from the functions of the script executing in the universal engine, which is why a DSP like system is fine.
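A very small sketch of such a watcher, under assumptions: the event names, the transition table and the class `Monitor` are all made up for illustration, and a real one would be fixed hardware or firmware rather than Python. What matters is that every state is listed in a table, anything not in the table means halt, and once halted it stays halted.

```python
HALT = "HALT"


class Monitor:
    """Table-driven watcher: no general branching on the data, only lookups."""

    def __init__(self, transitions, start, limits):
        self._transitions = transitions  # {(state, event): next_state}
        self._limits = limits            # {event: maximum allowed count}
        self._counts = {}
        self.state = start

    def observe(self, event):
        if self.state == HALT:
            return HALT  # fail safe: nothing brings it back
        count = self._counts.get(event, 0) + 1
        self._counts[event] = count
        # Unknown (state, event) pairs are not in the table, so they halt.
        next_state = self._transitions.get((self.state, event), HALT)
        # Exceeding a limit (or seeing an event with no limit) also halts.
        if count > self._limits.get(event, 0):
            next_state = HALT
        self.state = next_state
        return next_state
```

The monitor never sees the script's data, only a stream of observed events (the "signatures"), which is why it can stay this simple.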
Further you do not have to use the rather useful von Neumann architecture, a simplified Harvard architecture is actually better for the job. Thus you can further separate code and data completely (under certain constraints).
Thus you have your von Neumann universal Turing engine being watched by a subset of the Harvard architecture which is also a limited not universal engine.
Thus Turing and Goedel still get to play in the universal engine but Einstein believes his deterministic God of the Harvard engine is watching over them (he still gets to play dice though ;)
Thus you move from the notions of "security is not possible due to Turing and Goedel's problems" and "security is only possible in a fully known state machine" to the working middle ground.
Mathematicians have already done this with a similar problem at Los Alamos where Monte Carlo methods came to life (so the idea has a pedigree ;)
Thus you could call it Monte Carlo or probabilistic security.
So much for the overview.
1) Now if you simplify the view point a bit there are two basic types of security breach issues based around the inherent faults of the von Neumann - Turing engine,
1.1, It occurs by accidental combination of faults.
1.2, It occurs by the deliberate exercising of faults.
The second is AKA Malware.
Take a leap of faith for now that the accidental faults can effectively be stopped by the system I'm outlining, I'll come back to it later.
The issues with Malware are actually quite simple to describe: it is "extra code" that "should not be there".
2) Malware comes in two basic forms,
2.1, Built in.
2.2, Later additions.
The first I shall call "Malicious code" not Malware. The distinction is important because of the assumption that the code is "known to be good" when in fact it is not. And it is the main reason the current fad of "signed code" is currently a bit pointless (except where pointing the finger is concerned).
Malicious code "should be" detectable / preventable to a certain extent by formal methods used in the design and build. But it's early days on this and there are a lot of steps between current formal methods and code signing which are not covered thus allow plenty of scope for malicious code to be slipped in unnoticed.
Malware is what attackers add to a previously "known good" system to exercise inherent faults.
Importantly it is tangible "extra" code.
3) For this extra Malware code to survive even casual observation it has to be able to hide,
3.1) Malware is usually "trivially" easy to find on a system that is not running, simply by checking mutable memory (file systems etc) signatures etc. That is, it is "trivial" to determine on a byte by byte basis, which is not to say it is "trivial" in time or work load. The exception of course is those dynamic "pesky user" files...
3.2) To hide on a running machine Malware needs redundancy in the system it is on to set up what is in effect a Virtual Machine.
4) Two points arise from this Malware VM view,
4.1, If it cannot have the required resources it cannot hide.
4.2, No VM is perfect, time based and other signatures will reveal its existence.
5) The fewer resources Malware has the easier detecting it becomes.
That is because it has less ability to camouflage itself, and to detect probing and fake responses.
Importantly this is not a linear trade off which means there are sweet spots to be exploited.
And this is why bottom up as opposed to top down has advantages because in general it's at the bottom where you find the best sweet spots.
[To see this think of car safety, that is a car seat belt is only "so so", you can spend more money but each linear step in safety comes at a power law or exponentially increasing cost. Sometimes the cost curve is so steep it might as well be a brick wall. Likewise with an air bag.
However use both an air bag and a seat belt you get improved safety.
But... You can depending on the curves get improved safety using less expensive seat belts and air bags. And if you design them into the system from the start you effectively get your improved safety for free or less...
You can do the same with other safety features like crumple zones and SIPPs etc. Thus get a quite dramatic increase in safety but for no additional cost.]
You just have to work within the design sweet spots from the very early stages of the design to get the best advantages by making the appropriate trade offs.
6) One trade off is if you take your program apart into little tasklets or functions etc the resource required for each part is fairly easy to calculate in most cases, and there are libraries of code already in existence for parallel processing to show how to do this effectively.
Thus, say you have 20 little von Neumann - Turing engines, you can limit the amount of RAM or CPU cycles or input or output, for each one to give it just sufficient resources to do the job and no more.
Thus each is a jail cell tailored to the requirements of the function running inside it and very little more.
This puts pressure on the Malware in more than two ways.
6.1) The first is it does not have sufficient resources in any cell to be able to achieve very much let alone hide via a VM.
6.2) The second is it cannot unlike the desired program communicate from cell to cell simply.
6.3) Thirdly if the tasklets are sufficiently small their complexity is low, thus their characteristics are much more easily calculated, and thus they can be for the whole program.
6.4) Fourthly in a scripting language the functions can have base signatures without knowing what program is going to use them. As long as certain bounds are known just before they are used the signature can be calculated.
7) Due to this the programmer does not need to know what the signature is or care, they just have to know what the bounds are and pass them to the function control channel.
Thus the OS tasking the script has three parts to it. The first two are the traditional Unix ones that,
7.1, Load the function (puts the tasklets in the jails).
7.2, Connect the plumbing between the functions that the data traverses.
Note I don't say objects for good reason, as most object implementations are designed to be efficient not secure, and thus can leak/mix code with data etc. They also tend to be overly complex as well...
It is the third bit where the security happens.
7.3, Each function sitting in its jail is fed data by the OS and has results taken away by the OS as normal. However unlike normal the OS also monitors these channels to ensure they stay within the security signature bounds.
Further every so often the function is suspended and the jail is searched for contraband (ie extra code or data that should not be there).
7.4) The price that has to be paid is basically in "cheap" CPU workload not in "expensive" programmer effort. And CPU ability doubles about every year for the same cost whilst programmers want pay rises each year...
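The jail arrangement in 7.1 to 7.3 can be pictured roughly as follows. This is only a sketch under assumptions: the class `Cell`, the byte-count bounds and the hash-based "search for contraband" are stand-ins chosen for illustration, and a real system would bound RAM, CPU cycles and timing in hardware rather than counting bytes in Python.

```python
import hashlib


class BoundsViolation(Exception):
    """Raised when a cell leaves its declared bounds or holds contraband."""


class Cell:
    """One jail: a function, its known-good image and its channel bounds."""

    def __init__(self, func, image, max_in, max_out, search_every=2):
        self.func = func
        self.memory = bytearray(image)               # what the cell currently holds
        self._good = hashlib.sha256(image).digest()  # taken when the cell is loaded
        self.max_in = max_in
        self.max_out = max_out
        self.search_every = search_every
        self.calls = 0

    def search(self):
        # Suspend and search the cell: anything beyond the loaded image is contraband.
        if hashlib.sha256(bytes(self.memory)).digest() != self._good:
            raise BoundsViolation("contraband found in cell")

    def feed(self, data: bytes) -> bytes:
        # The OS owns both channels, so it can check them on the way in and out.
        if len(data) > self.max_in:
            raise BoundsViolation("input exceeds bound")
        self.calls += 1
        if self.calls % self.search_every == 0:
            self.search()
        result = self.func(data)
        if len(result) > self.max_out:
            raise BoundsViolation("output exceeds bound")
        return result
```

The programmer only supplies the bounds (`max_in`, `max_out`); the checking is paid for in CPU cycles by the layer that runs the cells.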
8) Now this brings us back to the original point,
What is the difference between the Castle and prison.
Actually very very little they both have,
8.1, A person in charge (Duke/Governor).
8.2, Trusted staff (Knights/Wardens).
8.3, Restricted environments where limited trust workers work (Serfs/Trusties).
8.4, Closed environments where untrusted workers are kept and worked (slaves/prisoners).
Further they both have external and internal defensive systems.
9) The real difference between the two is in the tipping point for trust, and the basic design that follows on from this.
9.1) In a Castle most and sometimes all the people inside are trusted implicitly and interact freely in well resourced areas (which is the analogy of our modern office networks).
9.2) In a Prison most are untrusted, are not allowed to interact at all and have very few resources.
9.3) In a Castle the security is by and large a perimeter issue.
9.4) In a Prison security is mainly an internal issue.
Thus in a castle a low % of resources are devoted to internal security and a higher % to external security and creature comforts.
In a prison the opposite is generally true.
10) The trust tipping point also changes the view point of the internal resources.
10.1) In a Castle you look at cohorts of considerable numbers of men who can be trusted to use the resources at hand wisely to get small and large jobs done and they achieve more per man for it, that is they are more efficient.
10.2) In a prison however you look at individual or one or two prisoners in a cell, their available resources are just sufficient to carry out small simple tasks and no more, and they are heavily monitored.
Thus it is the trust tipping point and granularity that make the difference between the Castle and the Prison.
In your Castle view the tipping point is near the OS level, thus the processes are like "cohorts", they are efficient but they are difficult to check for security.
In my Prison view the tipping point is down at single functions, thus they are like "prisoners", they are completely untrusted, monitored and only given sufficient resources to do their job.
And although less efficient the prison view is much much more secure.
11) At first sight less efficient would indicate higher cost.
However is that actually an issue?
As I noted above in 7.4 you are actually looking at CPU workload -v- software development cost.
11.1) Most if not all systems these days have CPU cycles to burn, thus they are already grossly inefficient so the cost there is effectively less than zero.
11.2) The system should have minimal impact on code cutters, and may well enforce "de-complexity".
11.3) It will improve exception handling and fault reporting thus easing not just test but field maintenance.
11.4) It will make users feel "enabled" and thus more responsive to the system.
11.5) It will also enable easy migration onto cluster style systems (think Cloud etc).
And quite a few other advantages not least of which is the increase in low level security against malware.
11.6) The downside is there is currently little hardware to support it. However as you noted above the chips to do it are starting to appear for other reasons. And the systems they end up in may well do "very nicely" as blades or COTS boards.
11.7) Overall the idea has more benefits than costs even without the increased security aspects. When the increased security is taken into account it has significant advantages for similar cost to existing systems.
12) Now a prediction or two for you that also tells you why I'm looking at it currently.
12.1) History shows us we go through computing cycles from central resources with user terminals/thin clients to self contained all in one desktop/laptop systems and back again.
12.2) We are currently swinging back to thin client with cloud computing etc.
12.3) CPU manufacturers cannot allow this to happen as the development costs for low demand server CPUs are significantly offset by high demand client CPUs. If the world becomes too server centric then progress on CPUs will stall.
12.4) Thus what is likely to happen is that our migration to the cloud will continue and $aaS (replace $ with appropriate letter) will first be outsourced then brought back into house as the silicon vendors move the tipping point to maintain revenue.
12.5) Along with this tipping point will be the putting of server type CPU's into more desktop and laptop systems.
12.6) But the spare CPU cycles will not go to waste as they currently do. The cloud will migrate back onto these platforms in a couple of ways.
12.6.1) The first will be the equivalent of SETI@home.
12.6.2) The second will be web servers on the client machine either as middle ware or middle ware and backend.
Such is the nature of re-use.
12.7) Cloud "Software as a Service" (SaaS) will see a strong "thin client" development for smaller more portable "connected" devices (netbooks mobile phones etc).
We have seen the start of this with Google's two OS offerings (Android and Chrome).
By current necessity the security of the client will get a lot lot stronger (something I started banging on about ages ago and Google actually got on with on the quiet). OS level security will migrate up into the browser, or conversely the browser will migrate down into the client OS (either way should improve security if not done the MS way with IE desktop integration).
My money (and Google's) would be on upwards security migration with the web browser becoming the equivalent of the new user "graphical shell" and comms and data flows migrated back down into the OS to get "security separation".
12.8) Thus we will actually favour hardware based VMs and MILS kernels and also encourage them to become finer grained.
12.9) The question then becomes will the kernels split out a large amount of functionality into "user space" with the equivalent of the old Unix CL utilities? Simply by splitting out the "trust" into the kernel where low level security actually belongs (not in the apps).
12.10) If this splitting out does happen will the dual path (user data / security control) in the kernel and functions be implemented giving the "signature" based probabilistic security?
I hope that adds some background and clarity to the idea?
"This is my first time posting here and I didn't intend to start things off on the wrong foot - my apologies"
No you were right to question (nobody here is perfect, we all miss things from time to time and as I'm getting on a bit... ;)
I'm sorry I jumped down on you, it's just that you came across to me as a "covert" PR rep. And I'm overly sensitive to them at the moment.
On Bruce's blog we have seen a fair few recently and some (like the "divining bomb detector") are obviously quite dangerous as people end up getting badly hurt or killed by their less than honest actions.
Some of those hurt and killed are civilians in distant lands who we may not know, others are troops we have sent who should be protecting them and they have friends and loved ones here. All deserve considerably better from us than Faux Security.
Any way, you deserve a more friendly welcome than I gave you, so excuse me while I change the foot I'm chewing on.
(Note to BF Skinner I still look like a "bear on a bad hair day", before you suggest an orangutan or some such 8)
By and large we don't bite (just chew 8), but we are sometimes rude in a friendly way.
However we can and do come across as a little blunt sometimes. For some reason it appears to go with the landscape (as anyone attending some of the early crypto conferences will confirm).
A little suggestion: although I don't remember any other person posting as "Jane" on this blog, it is a common enough name to be almost anonymous, so it might be wise to pop a letter or some such afterwards as "Nick P" does.
Oddly as others have noted in the past, my name is getting on for being anonymous as well 8(
Anyway, welcome, draw up a chair and stay a while.
In answer to your question...
I got a little email the other day rebuking me on letting cats out of bags re "how to enumerate Honey Nets", and "Headless Bot Net Control" from a (supposed) security researcher who works in industry.
So rather than tell you my thoughts I'll give you a link to some work by a well acknowledged "proper" security researcher ;)
Peter Gutmann makes interesting and easy reading (most of the time) and he usually calls it like he sees it, with practical examples.
He has coined the phrase Malware as a Service (MaaS) to describe one small aspect of what I mentioned to Winter about the "capitalisation of a botnet".
Put simply the "malware economy" is now considerably greater than the "drugs economy", which means that "buying a country" or two is within the "Criminal Information and Communication Technology" (CICT) Mafia's capabilities if they become more focused.
Look at Peter's presentation and then ask yourself two questions,
1) Are those who think botnet machines are worth only 5-20USD really well below the mark?
2) Are those who say botnets are for Spam and DoS, and that we know how to mitigate this, deluding themselves?
If you are not sure just remember that we have already seen recently a modified Zeus attack tailored to .mil etc that was trawling for documents...
Now ask yourself what value such documents might have, especially if you can get them across the "Security Air Gap", which has often been looked at as the best line of defense as "they can't cross it".
But it is mostly not implemented correctly, which makes the "Security Air Gap" at best a "Maginot Line" ( http://en.wikipedia.org/wiki/Battle_of_France ).
"A very nice write up about castles, prisons, and compartmentalization. Most certainly a reinvented wheel, though."
Actually not, read on to see why.
"Did you ever look at Bitfrost and the XO? The Android security model? They are going that way, and have implemented it."
As far as I remember the One Laptop Per Child XO model is based on BSD "jails".
I've had a look at the Android document you linked to and it appears Android has a similar problem.
They are both "top down" approaches to security and both stop their granularity at the process.
This lack of granularity makes generating signatures and controlling resources difficult as those who have played with SELinux are more than fully aware.
I have actually proposed two parts to the hardware "cell" which is a point you might have missed,
1, Minimal resources in hardware.
2, Signature control by hypervisor.
Likewise the program is not dealt with at the coarse "process level" nor the less coarse "thread level" but the fine "function level" (it could be done at the various levels under that such as the high language statement level or even the byte code instruction level).
The point is that a "cell" limits resources defined on what the function is to do and how it goes about doing it (the functions are built in like the old style Unix CL utilities and thus have well defined actions and fairly easy to produce signatures).
The resources made available to the cell via the hypervisor are,
1, CPU
2, Memory
3, Buffered File I/O
The function has no knowledge of which process it is part of only what it has been invoked to do via the script. The Script passes bounds to the hypervisor that tell it what the function should be accessing.
The hypervisor knows from this information what data to send via buffered file IO and what data it will get back. It will also know from the static analysis of the function what the CPU and memory parameters are.
Thus any malware has to fit inside of this very resource controlled cell and communicate via the controlled IO whilst still staying inside of the signature mask.
Further, at any point the hypervisor can stop the function without its knowledge and do a "cell inspection" to check what is there.
Thus the malware is going to have a very very difficult time (not impossible but getting close to it).
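To make the "cell" idea a little more concrete, here is a rough software sketch only (the real proposal is a hardware hypervisor, and every name below is made up for the example). It assumes the signature boils down to three bounds, input size, output size and number of steps, and it models the function as a Python generator so the stand-in hypervisor can stop it at the first violation.

```python
from dataclasses import dataclass
from typing import Callable, Iterator


class CellViolation(Exception):
    """Raised when a function leaves the bounds its signature allows."""


@dataclass(frozen=True)
class Signature:
    # Hypothetical bounds; in the proposal these would come from static
    # analysis of the function plus the bounds passed in by the script.
    max_input_bytes: int
    max_output_bytes: int
    max_steps: int


def run_in_cell(func: Callable[[bytes], Iterator[bytes]],
                data: bytes, sig: Signature) -> bytes:
    """Run func under the bounds in sig, stopping it on the first violation."""
    if len(data) > sig.max_input_bytes:
        raise CellViolation("input larger than the script declared")
    out = bytearray()
    for step, chunk in enumerate(func(data), start=1):
        if step > sig.max_steps:
            raise CellViolation("function ran longer than its signature allows")
        out += chunk
        if len(out) > sig.max_output_bytes:
            raise CellViolation("function produced more output than allowed")
    return bytes(out)


def upper_case(data: bytes) -> Iterator[bytes]:
    """A well-behaved function: one output byte per input byte."""
    for b in data:
        yield bytes([b]).upper()


def leaky(data: bytes) -> Iterator[bytes]:
    """A misbehaving function that tries to smuggle out extra data."""
    yield data
    yield b"stolen secrets"
```

The well-behaved function passes straight through, while `leaky` is stopped as soon as its output no longer fits the mask. This says nothing about timing or memory inspection, which the hardware version would also watch.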
The advantage of this "bottom up" approach is it gives you a whole load of extras as well for free.
You can see that various researchers are nibbling away at the sides of the idea but they are still coming at it top down not bottom up and thus are constrained by their own outlook.
Oddly if you look at the Android paper you linked to, sections 7.15 and 7.16 are very close to the idea but just don't quite go the distance. The reason being I suspect is it would require a whole re-work to Android that is not going to happen.
The important thing to remember about the hardware hypervisor is that it is not like a traditional von Neumann architecture but a more limited subset of the Harvard architecture, somewhat similar to that you find in some DSP chips. Without going into the ins and outs of it, this chip sits there as a warden looking at the cells. If it detects anomalous behavior it locks down the cell and passes it up the security stack.
One advantage of this Script/function approach is that it allows the functionality of unused parts of the program to be effectively removed. Thus for argument's sake you have a music player that goes to the Internet to download track and title info etc. You can simply tell the security system not to give any resources to that particular function and for it to return a null string etc. Provided the code cutter who wrote the script followed proper exception handling, the script should carry on without it or, if the function is vital for some reason, stop the script and produce a meaningful error message.
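A minimal sketch of the music player case, assuming the security policy is just a set of denied function names (the names `DENIED`, `invoke` and `play` are invented for the illustration, not part of any real system):

```python
class FunctionDisabled(Exception):
    """Raised when a vital function has been denied resources by policy."""


# Hypothetical policy: the user has told the security system to starve
# the function that goes out to the Internet.
DENIED = {"fetch_track_info"}


def fetch_track_info(path: str) -> str:
    # Stand-in for the network lookup; never reached while it is denied.
    raise RuntimeError("network access attempted")


def decode_audio(path: str) -> str:
    return f"decoded:{path}"


FUNCTIONS = {"fetch_track_info": fetch_track_info, "decode_audio": decode_audio}


def invoke(name: str, *args: str, vital: bool = False) -> str:
    """Call a function by name, honouring the deny list."""
    if name in DENIED:
        if vital:
            raise FunctionDisabled(f"{name} is disabled by the security policy")
        return ""  # the "null string" a denied, non-vital function returns
    return FUNCTIONS[name](*args)


def play(path: str) -> str:
    """The 'script': carries on without track info, but needs the decoder."""
    audio = invoke("decode_audio", path, vital=True)
    title = invoke("fetch_track_info", path) or path
    return f"Playing {title} ({audio})"
```

With the policy above, `play("song.ogg")` falls back to the file name as the title. Adding `"decode_audio"` to `DENIED` makes the script stop with a meaningful error instead.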
SELinux can do coarse grained resource control but it is only at the process level, which means it cannot do the above unless it is built into the program, which we know your average code cutter will not do simply because of "efficiency and features".
I'll leave it up to you to decide if my "wheel" is engineered for a function or artisan made for any horse drawn cart that comes along...
"As far as I remember the One Laptop Per Child XO model is based on BSD "jails"."
That is only part of the trick. They use the KVM to run each and every application. Then each application is run in a chroot jail.
Bitfrost is very much about protecting the XO from hardware wear-out attacks (it runs on a limited battery and flash). So they monitor resource use, including CPU use.
Furthermore, each application is installed with unchangeable permission for use of drivers. So a word processor or photo app have no access to the network etc. Also, data organization is different which allows a better control of what application can access what bits.
This does sound like it goes in the same direction as your idea.
But remember that the XO is intended to be a limited power device for children. It has to be practical and secure for use by children in a school environment. This means, they do mostly cover remote network attacks. If someone can get their hands on the hardware, the most they do is brick it.
All in all, I have not seen any chinks in that armor (but feel free to enlighten me).
The android is different. For one, they do not use KVM.
Alright, time for me to step in. I'll start by saying I like BitFrost, but not Android necessarily. BitFrost takes something we do well and improves on it in many ways. I'd love to see RHEL Desktop modified like that. It's not secure, though. How can I say that?
1. Hardware. Processor has known bugs, some potentially exploitable and several can be used to obscure malware. Firmware is unvalidated against subversion by end user, but many are like that. Any other system level attacks might be feasible, but let's move on to more realistic stuff.
2. OS. The OS is Linux. BAD choice for a "secure" system. When you look at an OS's kernel-mode code and see comments like "Why is this here?", you should worry. Getting several vulnerabilities/bugs in a month isn't comforting either. The principal issue is that the number of latent defects is usually proportional to complexity & lines of code. The Linux kernel is huge and each bug in it potentially has all the powers of kernel mode available to devastate reliability & security. OS's like QNX Neutrino, INTEGRITY & OKL4 manage to keep everything but the small kernel in user-mode and still run fast. Why does Linux continue to do the opposite? Linus hates microkernels. See Linus vs. Tanenbaum.
So, just using a monolithic OS like Linux increases the TCB and attack surface many fold. To see the difference in a practical application, look at Micro-SINA where they switched SINA VPN from Linux to L4/Fiasco. The difference is astounding. The best engineered microkernel-based OS I've seen is INTEGRITY: its architecture makes it almost impossible for any app to exhaust kernel resources and rigorous development/certification drives up assurance. Linux couldn't begin to pass one of those "correct by construction" evaluations. Neither could any other monolithic OS that I know about.
3. Drivers. In Linux, drivers are in the kernel. There are some tricks to get them out, but they aren't normally used. In microkernels above, drivers are in user-mode and given certain hardware access privileges. If a driver screws up, it doesn't crash the entire system via memory corruption or CPU overload. In monolithic system, one must trust the device drivers. Two methods: signature & extreme validation (think Microsoft's SLAM). Signature proves source is correct, but driver can still be buggy. Without an IOMMU, malicious drivers are still a problem, but monolithic systems must trust them totally. Result of an exploited bug in kernel mode? If you've ever used a few different graphics drivers in Ubuntu, you will know what I mean.
4. Resource control. This is a big issue that crosses over 1-3. The basis of a reference monitor to ensure security is proper mediation of "subjects" access to "objects." Subjects might be cells, users, processes or something else. Objects may be CPU time, memory, devices, file system, etc. Every subject and every object must be accounted for by the kernel in a very validated way. Anything with a shortcut around this will invalidate the security model. Additionally, the mechanisms and security model should be designed/built in such a way as to be practically feasible, like Bell-Lapadula MLS in military and MILS separation kernels in aerospace. Monolithic OS's, POSIX compliance, and bloated native installs are the exact opposite of this. Security enforcement is everywhere and nobody understands how it all fits together.
5. Abstraction & Composability. I know I've slowly moved from specific issues in Bitfrost to general security engineering issues, but bear with me because these are all relevant. I define abstraction quite like OOP: the implementation is hidden by a simpler interface. The overall system is "composed" by the interaction of the components, which are layered side-by-side or on top of one another. For a good example, see the GNTP (GEMSOS) evaluation or BAE Systems XTS-400 (linux compatible). They use strong layering and a protection ring model to make for a strong security architecture that still supports legacy (untrusted) code.
The idea is that we build each layer & test/verify it independently. Then, we verify the system as a whole by verifying the interactions of layers/components against system requirements. You simply cannot do that in systems as complex as Linux or even Android. Android does do some layering that improves its security, but the bottom layer has the problems mentioned in points 2-4. Because of this, one can't be sure things are interacting properly, data isn't leaking, etc. Under the hood, the system isn't deterministic enough to meaningfully evaluate to medium to high assurance. Additionally, it has a shitload of untrustworthy components in its TCB, bringing me to point 6.
6. TCB & non-TCB support libraries. If it's needed for the device to work properly, then it's part of the TCB or at least a functional requirement. Either way, if it had plenty of exploitable vulnerabilities (cough webkit cough) or is sufficiently complex, the system that depends on it cannot be considered secure. Then there's the new Java VM implementation. (bughunt round 2!) Sure, there may be limitations. Maybe the best the enemy can do is DOS, but why should even that succeed? If you ask me if Android is a secure platform, you are actually asking if Linux, Android stack (including webkit), & Java libraries are secure. That's a lot bigger than "phone OS" would suggest. ;) Bitfrost is a hardened Linux distro with some security features. Is a hardened Linux secure? See 2-4.
7. The Internet. This is when it gets extra crazy. Before this thing entered the picture, the device was in the hands of a trusted individual who presumably wouldn't destroy their own crap. Internet will bring many attack vectors, which I need not mention, that every mainstream OS (incl. Linux) has failed to defeat. Malware comes in all shapes and sizes & targets vulnerabilities obvious and obscure, straightforward and esoteric.
Conclusion: If you want a secure system, then you must secure its building blocks and their interactions. If you want secure data, it must be secured (somehow) at rest, in use and in transit. By secure, I mean I don't lose everything, my identity included, at least once and restore several dozen times a year. When Clive and I discuss med to high assurance, we are talking about *real* security. Fortunately, we are just discussing software right now. Defending against physical or hardware-based attacks, like against radiating something or introducing voltage glitches, requires more than I mentioned. If you doubt the importance of this stuff, look up the recent article about 1,024 bit RSA being defeated by a powerline attack. That's f***ed up. ;)
How to build a secure system? Well, there are many methods. I've been espousing a MILS approach, but with more dynamic scheduling. The important thing is to follow the concepts. Look up Saltzer and Schroeder's info security principles for a start and then Schell's GEMSOS Lessons Learned paper. The basic lesson is good requirements gathering, unambiguous specification of requirements, extremely rigorous implementation, and strong correspondence between requirements, design & implementation. Modularity, layering & avoiding common pitfalls are always a good sign. Event-driven model is easiest to pull off, which is what open-source fully verified seL4 did. Best to learn what is secure (and securable) by looking at examples. Aesec's GEMSOS crypto seals, Nizza/Perseus Architectures, Nitpicker GUI, CapDesk desktop, OP web browser, OKL4 secure hypercell (in tons of phones) and INTEGRITY's Platform for Secure Networking & Mobile Devices. For hardware, look at General Dynamics line, esp. INFOSEC processor, and MIT's Aegis processor. At system/integration level, look at Tokeneer EAL5 open-source demo, Praxis's other work, Galois's work, anything MILS-related, LOCK project (doc's online), BLACKER VPN, Rockwell Collins' Turnstile, and Sectera Edge.
These products, whether demos or products, all embody many good principles of secure architecture, development, execution, and/or maintenance. A few, such as GEMSOS-based BLACKER & INTEGRITY, have been in high-risk military products for over a decade without failing. XTS-400 & Boeing LAN have done pretty well too, even being medium assurance by comparison. So, I could expand the definition to good principles plus years of survival against people who know how to knock weak systems down. I mean, there's still configuration and securing apps and such, but a secure TCB solves so many problems for the developer. I mean, OS's like Linux don't even provide a trusted path from user to arbitrary system process. Every other solution I mentioned does. Unspoofable login screens, strong guarantees against keylogging between domains, etc. If we go with good design/implementation/validation and maturity, then INTEGRITY or GEMSOS are best way to start new secure platform. Bitfrost, Android, and Linux might be "good enough" for casual users in a risk analysis, but calling them "secure" is just verbal masturbation.
For those interested in the RSA power line attack
It was carried out by three people,
It was presented on the 10th of this month at DATE.
You can get a copy from Valeria's pages at U Mich,
In many respects it is an important paper for the open research community.
In essence what they have done is apply a simple "fault injection" to "harmlessly" attack a standard machine and standard off the shelf software and reconstruct the 1024bit RSA key.
To use their words,
"We report on the first physical demonstration of a fault-based security attack of a complete microprocessor system running unmodified production software"
And the software attacked is,
"We expose and exploit a severe flaw on the implementation of the RSA signature algorithm on OpenSSL, a widely used package for SSL encryption and authentication."
The reason this is possible is the way the software has been written in a "linear and always repeated" way, a failing that is in much software and really should have stopped twenty years ago, when the first "signature attacks" against Smart Cards surfaced, to become DPA in the late 1990's...
Oh and as I have known for thirty years and said on the odd occasion, the same attack can be carried out by EM Field fault injection, which conceivably could be done from 200ft away using equipment from the surplus radar equipment market which has been modified to allow AM/pulse modulation of the "fault signature" onto a 10GHz or equivalent carrier (the difficult bit is "locking the fault injection to the system under attack").
This is the second paper this year which has shown how a simple attack can have devastating consequences on systems.
At the root of the problem is the issue of "efficiency -v- Security" allowing an exploitable side channel, a theme I have banged on about for quite some time (as some of you will know).
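For those who want to see the "efficiency -v- Security" trade in miniature, here is a toy sketch of one well known general defence against fault attacks: verify a signature before letting it out of the box. This is NOT a reproduction of the attack in the paper or of OpenSSL's code; the key is a textbook toy key, and the `flip_bit` flag is an invented stand-in for an injected fault.

```python
# Toy RSA parameters, far too small for real use; chosen so the numbers
# are easy to check by hand.
P, Q = 61, 53
N = P * Q            # 3233
E = 17
D = 2753             # E * D == 1 (mod (P - 1) * (Q - 1))


class FaultDetected(Exception):
    """Raised when a freshly made signature fails its own verification."""


def sign_raw(message: int, flip_bit: bool = False) -> int:
    """Textbook RSA signature; flip_bit simulates a fault in the result."""
    signature = pow(message, D, N)
    if flip_bit:
        signature ^= 1   # stand-in for a glitch corrupting the computation
    return signature


def verify(message: int, signature: int) -> bool:
    """Check the signature with the public exponent."""
    return pow(signature, E, N) == message


def sign_checked(message: int, flip_bit: bool = False) -> int:
    """Sign, then verify before release: one extra public-key operation."""
    signature = sign_raw(message, flip_bit)
    if not verify(message, signature):
        raise FaultDetected("signature did not verify; refusing to release it")
    return signature
```

The extra `verify` call is pure overhead on every good signature, which is exactly the sort of cost that gets optimised away in the name of efficiency.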
Great write up. Obviously, I can only agree. I know several kernel developers are ranting against Linux insecurity on a regular basis (eg, Alan Cox used to keep a special "secure" kernel tree).
But I am afraid we are drifting towards the "embed it in concrete and dump it in the Mariana trench" security.
The aim of computer security is to get to a point where the expected losses (risks) due to security breaches are much lower than the profits from using the computer.
Increasing the cost, in all meanings of the word, of the security can quickly get you into the situation that it becomes pointless to actually use a computer. That was probably one reason Android did not implement the Bitfrost model.
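As a back-of-the-envelope version of that argument (all the figures in the usage note are invented purely for illustration, and the function name is my own):

```python
def expected_net_benefit(profit: float, breach_probability: float,
                         breach_cost: float, security_cost: float) -> float:
    """Profit from using the computer, minus expected losses and security spend."""
    expected_loss = breach_probability * breach_cost
    return profit - expected_loss - security_cost
```

With a profit of 1000, a breach cost of 2000 and cheap security (cost 50, breach probability 0.25) the computer is still worth using. Push the breach probability down to 0.05 at a security cost of 1200 and the result goes negative: the machine is safer but no longer worth switching on.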
The OLPC is a case in point. The aim was to supply children with a low cost laptop with cheap (free) software. This makes any OS other than Linux a non-issue. Bitfrost had only two goals: 1) protect the children 2) protect the laptop. Obviously, they would rather brick the laptop, data and all, than endanger the children. This is completely different from the economics of an average (military) workstation, where the data are worth more than the hardware and user combined (I kid).
The optimal solution for the OLPC is to keep the cost of the security below the replacement cost of the laptops. That is, simply (remotely) decommission compromised laptops and keep backups. I think Bitfrost went a long way into reaching that optimal point. And it works for *children*, security without passwords and the like.
The security policies you presented would make the OLPC impossible (at least for the next decade). It would also make most home use of PCs impossible. It would simply become too expensive and too cumbersome to use a secure computer. At least until some secure OS along these principles arrives, and enough applications are ported.
So I do think Bitfrost is a step in the right direction. Yes, it is a step, and not the whole way. But to get somewhere, you need to make one step at a time. Jumping tends to make you fall and break bones.
Bitfrost works, even on limited hardware, and even a child can use it. If all desktops would implement it, that would be a huge improvement.
And, what would be the next step from Bitfrost?
Well, sure, we definitely must balance the cost of security against the goals. Your example of Android was quite proper: cell phone users want to be very expressive & wouldn't have purchased a cell phone that works like BitFrost. I have to clarify that the big point behind the stuff in my last post is that it can all be seamless. For instance, a processor that works, driver updates that are certified/signed, drivers/services in user-mode instead of kernel mode... The list goes on & the user wouldn't be able to tell the difference between a secure microkernel w/ Linux API & a hardened Linux kernel. Additionally, the secure program design principles would be done by developers and the installation routines might, a la Bitfrost, tell the system what privileges the app needs. The main difference isn't the usability or surface behavior: it's what goes on under the hood & behind the scenes.
Cost is still an issue, though. Obviously, securing the platform & producing low defect software will add to the cost. I think the cost of securing the platform is worth it. With a secure platform, we could build solutions to many other problems. Without one, we *cannot* solve the other problems. Billions of dollars are annually wasted on security solutions built on weak TCB's that invalidate their security. I think spending even tens of millions on developing a secure platform has huge payoff. The good news is that, as in previous post, they are already here. They just need to be customized for desktop use. Currently, though, a high assurance COTS solution like GH Platform for Sec. Networking gives an OEM secure RTOS, drivers, partitioned networking, partitioned filesystem, virtualization (user-mode), compatibility layers & a top-notch IDE for a few hundred thousand one time. A few hundred K one time to solve 90% of the technical problems? Sounds like a steal to me. I'm just not an OEM with deep pockets. (tears surface...)
So, my conclusion is that, for even mid-sized firms, secure software is affordable and usable via great tool & middleware support. Cleanroom, Praxis Correct by Construction, & Galois Haskell tricks have cost effectively produced solutions with little to no bugs & no dead (useless, maybe malicious) code. Cleanroom is easy to learn & extensive studies support its defect-reduction claims. So, developers can get in on the action too at least for security-critical software. For these basic issues, it's not the cost: they just lack the will. They don't care because the market doesn't care. The person who solves the security problem will be the person that gets the market to care and shows suppliers how to do it affordably and as painlessly as possible.
So, I've talked plenty about them. I threw out nearly every relevant example in the previous post, but I don't know if you had the time or will to Google all of them. How about I give you a few links where you can look at real-world versions of what I'm saying & answer the question: could lay users benefit from these & do any examples seem good enough to deploy further? (Note: to prevent moderation delays, I only provided 3 hyperlinks. don't know the limit & hate waiting for spam filters. sorry for inconvenience ;)
INTEGRITY Secure Virtualization (top solution I've seen; see their examples on bottom)
Solutions Using INTEGRITY Separation Layer (note that they are very practical)
Perseus Security Architecture (whole site is a great read)
Recent Nova microhypervisor (Nick: hypervisors done RIGHT!)
at hypervisor DOT org
Turaya Desktop is a Perseus-based Data Loss Prevention suite if you want to look it up.
@ Clive Robinson
"The resources made available to the cell via the hypervisor are, 1, CPU 2, Memory 3, Buffered File I/O"
I swear you might as well be talking about MILS. Your prison approach & their castle approach parallel so much. We could have an almost-prison by building a MILS castle where all the apps were decomposed into primitive functions, each in their own partition & POLA enforced on communication, which also runs critical portions through guards. If the smaller functions were like black boxes, as in previous designs we've discussed, they wouldn't even need monitoring: just the main active components. That means software guards could be used with less of a performance hit. Separation kernel would also impose CPU/memory restrictions you mentioned. The only thing that would be missing was the trusted, DSP thing doing the monitoring: it would be replaced by software guards & probably use FSM's to track state & do safe fails and such. Like I said, your method and the modern MILS approaches are very similar. They usually partition at the component or subsystem level of granularity for performance reasons, but you can do it as small as functions if you like.
A good example of this is OKL4 "Secure Hypercell" tech in smartphone virtualization. It's a fancy word that means it's a small, ultra-reliable, capability-based microkernel that can run/manage individual apps (which may be component functions of others) and VM's via virtualization layer (user-mode) side by side. They aren't a separation or MILS kernel, but it's pretty close in function/assurance/performance. So, you start by running your app in VM (Linux, Symbian, etc.). You take security- or safety-critical code out & put into separate partition. Its TCB is suddenly that of the microkernel. As they point out, you can further decompose the app as much as performance, time & labor allow. Aside from real-time monitoring with coprocessor, doesn't this seem pretty close to what you mention? Not the same, but very close? The source of those claims is linked below. They have nice visuals of how it works, along with a nice field track record to back it up with.
Thanks for the links.
@ Nick P,
"I swear you might as well be talking about MILS. Your prison approach & their castle approach parallel so much."
Yes and no, MILS has similar granularity.
The bit it lacks over and above the "DSP" bit is having the pre built "function" components with pre built signatures to monitor.
One of the points behind this is to remove a big chunk of the low level security issue from the equation (ie code cutters).
The programmer (much as in the old IBM assembler) selects a function to do something such as read in data from a file (or the Unix equivalent etc).
This function comes in two parts: the bit the programmer wants and the security signature which is passed to the hardware hypervisor. The OS adds the required values to the signature to make it fit the job in hand and limit the required resources right down.
Thus a chunk of low level security is there without the programmer having to do anything to get it.
Further the smaller the function the tighter the signature can be bound to the function and the easier the signature is to produce.
Also you force not just a standard exception handling model but also better error messages and better logging into the system at close to zero extra cost.
This is the bit you get for free doing the bottom up approach which you tend not to get from the top down approach.
Don't get me wrong I like MILS I just think it could do a lot more for the same money with a slightly different underlying architecture.
As an analogy, take Side Impact Protection Systems in cars,
The top down approach makes them an expensive addition to an existing design.
The bottom up approach on a new design builds them in as a structural feature allowing optimal placing etc.
The for free bit: converting the car design to a soft top is oh so easy, with none of the old problems where the design just assumed the doors etc were there for the niceties, not for safety.
It's the way you look at things from the outset, before even the requirements analysis has put pen to paper.
I guess the biggest problem I have with your solution is that I don't understand how you are going to implement it. I'm mainly a software guy, so this is pretty low level for me. It seems like your DSP is treating the CPU & external stuff as black boxes, watching the IO & keeping track of state. I'm wondering how you will get the CPU & DSP to work together that way when the CPU's internal state is very important. In modern systems, the CPU is what loads the initial trusted state: if it screws up, how do you even boot the fail safe? And is your DSP solely looking at bit patterns or does the CPU tell it certain things (e.g. beginning to read file from device 2 to this address space)?
@ Nick P,
"I'm mainly a software guy, so this is pretty low level for me. It seems like your DSP is treating the CPU & external stuff as black boxes, watching the IO & keeping track of state. I'm wondering how you will get the CPU & DSP to work together"
The problem I think you are having is you are thinking in terms of the "monolithic slab CPU" we currently have from Intel and AMD etc, which has, like the "saber toothed tiger", over-specialised to do just one evolutionary task well to the detriment of all others.
Go back to a "simpler time" around 20 years ago of PC accelerator cards using multiple RISC CPUs, each with their own private memory block and no need of paging and other Virtual Memory tricks etc.
Each RISC block exists in a "world of its own" when running, just like an embedded microcontroller. Think about them from the "simple embedded microcontroller" view point of CPU control bus, memory address/data bus and I/O bus (look at the original ix86 CISC design when x was less than 4).
That is, the RISC CPU comes out of reset and executes its "embedded program": it reads data from an I/O bus "in" device, processes it and writes it to an I/O bus "out" device. If the I/O bus is held the CPU halts/pauses (stalls on I/O).
If you think on it a little further it is a hardware representation of a simple Unix Command Line pipelined process.
[The C-CPU has no knowledge of security, that is done by the H-CPU and system "OS" which sets up the "plumbing" of IPC via "pipes" or "sockets".]
Now imagine a high end program written as a script that is split into all its sub functions. All the sub functions are "pipelined" together by the system "OS". In simple cases this would be less "efficient" than a monolithic code block but only in CPU "grunt".
From the "embedded microcontroller" view point, each sub function is the "embedded program" in an "embedded controller" all sitting on a backplane with multiplexed I/O and control buses (as seen in high end Cray and Sun Starfire boxes and IBM Z servers).
This is a proven method of doing "parallel processing" which due to the proliferation of COTS has been virtually eclipsed.
The addition I've made is the hardware hypervisor doing its "DSP" on the "function signature".
So yes "when running a function" the compute engine (C-CPU) is seen as a "black box" from the outside, and from the inside "its own little universe". However the "system" does not get to see or talk to the C-CPU engine (CPU+Memory+I/O), it talks to the Hypervisor (H-CPU) that watches the C-CPU when running, and modifies the compute engine's memory when the CPU is not running.
The system sees the Hypervisor engine as being on two separate buses, the traditional "data/comms" buses and in addition the security bus (more on these buses and backplanes later)
So the "data flow" in the system passes data to the first hypervisor which passes it down to the compute engine that filters/processes the data and hands it back to the hypervisor which passes it back to the system. The system then takes this processed data and passes it to the next hypervisor in the line.
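As a rough illustration of that data flow, here is a minimal Python sketch. The class names (ComputeEngine, Hypervisor), the system_run helper and the max_output bound are all hypothetical stand-ins, not part of the design described above; the only point is that the "system" never touches a compute engine directly.

```python
from typing import Callable, List


class ComputeEngine:
    """Hypothetical C-CPU: runs exactly one small function and sees nothing else."""

    def __init__(self, function: Callable[[bytes], bytes]):
        self._function = function

    def run(self, data: bytes) -> bytes:
        return self._function(data)


class Hypervisor:
    """Hypothetical H-CPU: the only thing the system ever talks to."""

    def __init__(self, engine: ComputeEngine, max_output: int):
        self._engine = engine
        self._max_output = max_output  # assumed build-time resource bound

    def process(self, data: bytes) -> bytes:
        result = self._engine.run(data)      # pass data down to the compute engine
        if len(result) > self._max_output:   # trivial stand-in for a real check
            raise RuntimeError("output exceeds bound; cell flagged")
        return result                        # hand the result back to the system


def system_run(chain: List[Hypervisor], data: bytes) -> bytes:
    """The system passes data from one hypervisor to the next in line."""
    for hypervisor in chain:
        data = hypervisor.process(data)
    return data


chain = [
    Hypervisor(ComputeEngine(bytes.upper), max_output=64),
    Hypervisor(ComputeEngine(lambda b: b.replace(b" ", b"_")), max_output=64),
]
result = system_run(chain, b"hello prison cell")
```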
The only real issue is getting the "embedded function" onto the C-CPU.
Apart from the top control "King/Gov" engine all the other engines have their memory loaded and locked for them (MMU's and DMA done right are a security plus, more on that later).
Ignore how the top control engine starts up for now just assume it comes up "secure".
All the other engines are held in reset by their respective hypervisor(s), which in turn are held in reset by the system.
The Gov engine takes control of one of the non IO engines and its attendant hypervisor and loads an image into it to act as a simple security scheduler for the lower level hypervisor(s).
The Hypervisor, as I said, is a simplified Harvard architecture and has a code side and a data side. The code side is effectively "ROM" (see how some DSP chips load code into themselves after reset from a "hidden external EEPROM") and remains non mutable during the hypervisor execution. The data side faces its respective von Neumann engine(s).
Conceptually each time a "cell" engine is used for a new function,
1, Both the hypervisor (H-CPU) and compute (C-CPU) engines are reset.
2, The H-CPU engine gets the signature and bounds loaded on the code side.
3, The C-CPU engine gets the function code DMA'd into its restricted memory space via a one way private bus.
4, The C-CPU engine MMU is preloaded and locked.
5, The H-CPU engine is taken out of reset and it takes control of the data I/O to the C-CPU.
6, The H-CPU takes the C-CPU engine out of reset.
And the "black box" is now running a function.
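The six steps can be sketched as a small state model. This is only an illustration under assumptions: the Cell class, its field names and the log are invented here, and the real thing would be hardware reset lines and bus control, not Python objects.

```python
class Cell:
    """Hypothetical model of one H-CPU/C-CPU pair; all names are illustrative."""

    def __init__(self):
        self.h_in_reset = True
        self.c_in_reset = True
        self.signature = None
        self.code = None
        self.mmu_locked = False
        self.log = []

    def load_function(self, signature, bounds, code):
        # 1, Both engines are reset and any previous state is cleared.
        self.h_in_reset = self.c_in_reset = True
        self.signature = self.code = None
        self.mmu_locked = False
        self.log = ["reset both"]
        # 2, Signature and bounds are loaded on the H-CPU code side.
        self.signature = (tuple(signature), bounds)
        self.log.append("load signature")
        # 3, Function code is DMA'd into the C-CPU's restricted memory.
        self.code = bytes(code)
        self.log.append("dma code")
        # 4, The C-CPU MMU is preloaded and locked.
        self.mmu_locked = True
        self.log.append("lock mmu")
        # 5, The H-CPU comes out of reset and takes control of the data I/O.
        self.h_in_reset = False
        self.log.append("release h-cpu")
        # 6, Only the H-CPU takes the C-CPU out of reset.
        self._release_c_cpu()

    def _release_c_cpu(self):
        if self.h_in_reset or not self.mmu_locked or self.code is None:
            raise RuntimeError("C-CPU released before the cell was prepared")
        self.c_in_reset = False
        self.log.append("release c-cpu")


cell = Cell()
cell.load_function(["loop_enter", "loop_exit"], bounds=256, code=b"\x00\x01")
```

The ordering check in _release_c_cpu is the part worth noticing: the compute engine cannot start until its memory image and MMU are already fixed.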
Back to the DMA and MMU and security. Done right it can be made very very secure. It is a question of control and bus isolation.
DMA of the form I'm looking at is quite simple: you have the hypervisor CPU (H-CPU) engine outside of the Compute CPU (C-CPU) engine. The C-CPU has no ability to see anything other than its memory and a simple I/O status register. The memory is controlled by the DMA controller, which is controlled by the H-CPU. Thus the C-CPU just sees memory buffers where data "magically" appears and disappears and an I/O status register telling it when the magic has happened.
The H-CPU does the "magic" in four ways,
1, Load memory images when C-CPU is held in reset.
2, Load data into "read only input buffer(s)" when C-CPU is "held/paused".
3, Unload data from "write only output buffer(s)" when C-CPU is "held/paused".
4, Inspect all memory for "contraband" when C-CPU is "held/paused".
The MMU is likewise controlled by the H-CPU and the C-CPU has no knowledge of it. Thus the H-CPU allocates only sufficient memory and buffer space to the C-CPU to do the function. Thus any low level malware is starved of resources.
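The four kinds of "magic" and the tight buffer allocation might be modelled along these lines. This is a sketch only: DmaCell, its state strings and the byte-pattern "contraband" check are assumptions made for illustration, not a description of the actual hardware.

```python
class DmaCell:
    """Hypothetical H-CPU view of one C-CPU's memory; names are illustrative."""

    def __init__(self, in_size: int, out_size: int):
        self.state = "reset"                 # "reset", "running" or "held"
        self.image = b""
        self.in_buf = bytearray(in_size)     # only as much space as the function needs
        self.out_buf = bytearray(out_size)

    def _require_held(self):
        if self.state != "held":
            raise PermissionError("C-CPU must be held/paused")

    def load_image(self, image: bytes):
        # 1, Memory images are loaded only while the C-CPU is held in reset.
        if self.state != "reset":
            raise PermissionError("C-CPU must be in reset")
        self.image = bytes(image)

    def load_input(self, data: bytes):
        # 2, Input buffers are filled only while the C-CPU is held/paused.
        self._require_held()
        if len(data) > len(self.in_buf):
            raise ValueError("input larger than allocated buffer")
        self.in_buf[:len(data)] = data

    def unload_output(self) -> bytes:
        # 3, Output buffers are drained (and cleared) only while held/paused.
        self._require_held()
        data = bytes(self.out_buf)
        self.out_buf[:] = bytes(len(self.out_buf))
        return data

    def inspect(self, contraband: bytes) -> bool:
        # 4, All buffers are searched for a known-bad pattern while held/paused.
        self._require_held()
        return any(contraband in region for region in (self.in_buf, self.out_buf))


cell = DmaCell(in_size=8, out_size=8)
cell.load_image(b"\x90")
cell.state = "held"
cell.load_input(b"abc")
found = cell.inspect(b"abc")
```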
The problem with nearly all COTS CPUs is the assumption that the CPU controls the MMU to do paging etc itself, which means if malware gets "ring 0" it's game over.
The whole point of this design is the C-CPU does functions of scripted programs and thus does not need to page memory in and out or multi task or any other of those "kernel functions" where low level security holes happen. Thus you can take a simple "off the shelf" FPGA design and hack the design down and providing there are enough pins you can put both the H-CPU and C-CPU on the same FPGA.
If you were going into serious production you can see that with quite a small modern silicon area you could have four "cell CPU" cores, each with a private chunk of the 8Mbyte of internal memory, and one main hypervisor and four slave hypervisors, all in the same footprint a mid to low range Intel/AMD laptop CPU currently has.
Which brings me onto backplanes and data/control and I/O buses.
I/O to the real world is effectively "untrusted", as it should be, and the usual MILS Castle approach applied.
The low level of the Prison approach is to leverage "parallel processing" and "scripting" techniques into MILS and other security architectures to get rid of code cutter "glory balls" of programs in a way that,
1, Reduces the granularity to "proven secure functions".
2, Where the secure functions have low complexity and low resource needs.
3, Where the low complexity allows effective easily tailored security signatures.
4, And the resources can be tightly fitted and controlled around the function.
5, Where the tailoring and resource control is done by the build process, not the code cutter.
Thus the low level security against malware is taken almost completely out of the code cutter's hands. Which means you get the low level security at the cost of system "grunt", but this comes cheap by comparison with training code cutters and doing low level code audits etc etc.
But as you are doing bottom up engineering you get a whole host of freebies.
First off, as I said, iX86 where x is above 4 is overly optimised for single chip "specmanship" and is mostly a complete waste of silicon in multi CPU systems (it is why we are now seeing Intel go down the multi core route). Thus the silicon that can be freed up will provide plenty of gates for the hardware hypervisor, and still have gates spare.
Forcing parallel programming onto code cutters is very good, it makes their product scalable onto SaaS or "cloud" systems much much more efficiently.
Forcing the parallel programming to use trusted functions that are effectively stand alone programs (ie scripting) forces the correct addressing of "exception handling" and "error messages" onto the code cutters via the build tools. As these are the root cause of well over 60% and as much as 90% of "in the field" problems we would expect some real gains to be made in this area.
There are other gains such as "code reuse" will actually work, but hey you want the bad news ;)
Firstly there will be increased overhead in terms of hardware and a decrease in efficiency.
However looking at previous systems that have been properly designed to be "reliable" and "scalable" these extra overheads are relatively small (ie a few %). Which means they are very much irrelevant in most cases.
The usual "object orientated" code approach won't work. This is not an object issue but a code cutter issue. As we know from experience objects fit well in user interfaces but just don't work in transactions (try mapping objects onto an RDBMS properly, if you think pain is pleasure you will be in seventh heaven). The reason is simply the wrong viewpoint.
Anyway many "objects" try too hard to be "all things to all men", which from an efficiency or security viewpoint is just bad news (just compare the C standard library to the C++ standard libraries to see why this can be bad news).
Oh and the system is limited in what "security" comes for free. It obviates the malware attack issue but does not address the issue of "malicious code". That is, if the code cutter deliberately puts in malicious high level code this system is not going to detect it (otherwise secure programs would just write themselves). But it does limit most of what a malicious code cutter can do to the point where they are squeezed out by "formal methods"...
Alright, I've archived the entire discussion for future reference. I think I'm beginning to see some of the pieces come together. I still have qualms about the signature approach being unable to differentiate between malicious & non-malicious in real-time, but your design is reset-based & hence would prevent long-term reinfection. Are you even really analyzing the IO with the monitor or are you just using it to direct, examine the memory of, and reset the compute CPU's? In previous discussions, you described it somewhat like an IDS, but now it just seems like a real-time, code-integrity checking engine for compute nodes.
Another issue, possibly due to my limited hardware experience, is with the silicon requirements. A typical platform is millions of lines of code which probably correspond to hundreds of thousands of functions, each requiring a computer/hypervisor pair. Even the simplest CISC cores take up a significant chunk of any affordable FPGA. Custom designing functions & acknowledging they will probably be multiplexed to reduce compute node requirements leads me to believe the system still needs almost a hundred thousand FPGA's or tens of thousands of high end FPGA's. That's a lot of FPGA's, Clive. Even assuming one could do a lot with an Intel-type wafer, Intel's manufacturing processes are far ahead of the curve compared to FPGA's, proprietary, and unlikely to be released to give them competition. They're too invested in Itanium for mission-critical apps. Looking at ClearPath, Tilera, and NVIDIA parallel chips, I don't see anyone who can squeeze enough multiplexed functions on a chip to make this design cost-effective.
@ Nick P,
I'll try to be quick but a tad more in depth on this reply without getting too deep.
"I still have qualms about the signature approach being unable to differentiate between malicious & non-malicious in real-time"
I can understand that as it takes a whole different "world view" from normal (which is why I keep talking about bottom up not top down design).
Think not of "whole monolithic programs" in the normal sense. Nor of programs in scripting languages like Perl. Think of small functions made as micro programs that are scripted via IPC (Unix shell script and CL utilities).
Each function is very small has a minimal number of loops and next to no storage. It is more of a "filter" building block than a "user program".
The signatures are "heartbeats", either put into the function by the build tools or existing naturally via the I/O buffer requests. Thus the heartbeat type and number are known to the build tools that make the "skeleton" of the signature. The hypervisor knows approximately what order each heartbeat comes in and, importantly, when and what its tag should be.
Any change in this expected heartbeat behaviour is not good news and raises a security indicator for further consideration (ie a cell search).
Thus a "do while" loop would have a heartbeat at the beginning to say it's in the loop and what the limits are; a sub heartbeat at each iteration has the loop count and while test input and result. Likewise any break conditions have heartbeats.
Thus any malware would have to know what nearly unique heartbeat had to go next and when, which would require it to be able to analyse the state of the prison "cell" before the valid heartbeat time out. And as the cell is resource constrained...
The heart beats enable not just the signature but also a very good debug/diagnostic for fault tracing and error messages and why any exceptions occur (remember nearly 80% of faults are not code logic faults).
All unexpected exceptions cause an automatic "cell search" and dump of stack and heap to log for analysis by the hypervisor.
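To make the heartbeat idea a little more concrete, here is a minimal sketch of a hypervisor-side check. The Expected record, the tag strings and the timing values are invented for illustration; a real skeleton would come from the build tools and the check would run in hardware.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Expected:
    tag: str         # heartbeat tag the build tools say should come next
    deadline: float  # latest allowed gap since the previous heartbeat (assumed seconds)


def check_heartbeats(skeleton: List[Expected],
                     observed: List[Tuple[str, float]]) -> str:
    """Return "ok", or a reason that would trigger a cell search.

    observed is a list of (tag, gap since previous heartbeat).
    """
    if len(observed) != len(skeleton):
        return "wrong heartbeat count"
    for i, (expected, (tag, gap)) in enumerate(zip(skeleton, observed)):
        if tag != expected.tag:
            return f"unexpected tag at {i}"
        if gap > expected.deadline:
            return f"timeout at {i}"
    return "ok"


# Skeleton for a two-iteration "do while" loop (hypothetical tags and timings).
skeleton = [
    Expected("loop_enter", 0.010),
    Expected("iter", 0.002),
    Expected("iter", 0.002),
    Expected("loop_exit", 0.002),
]
good = [("loop_enter", 0.001), ("iter", 0.001), ("iter", 0.001), ("loop_exit", 0.001)]
late = [("loop_enter", 0.001), ("iter", 0.001), ("iter", 0.050), ("loop_exit", 0.001)]
forged = [("loop_enter", 0.001), ("shell", 0.001), ("iter", 0.001), ("loop_exit", 0.001)]
```

Anything other than "ok" would be the point at which the cell search and stack/heap dump are triggered.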
"Are you even really analyzing the IO with the monitor or are you just using it to direct, examine the memory of, and reset the compute CPU's?"
It is both, although I'm not having to do too much IO content analysis at the moment; it's a work in slow progress (as am I of late ;). Also it may not be required in many cases for subtle reasons.
"In previous discussions, you described it somewhat like an IDS, but now it just seems like a real-time, code-integrity checking engine for compute nodes."
It's probably because I'm focusing in on individual bits at a time.
The heartbeats and the I/O checking give you the signatures for IDS, which also give you some of the code-integrity checking. The random "cell search" gives you full code-integrity "Monte Carlo" style.
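A random "cell search" could be sketched like this. Comparing a SHA-256 digest of each cell's memory against a build-time value is an assumption made for the example; the text above does not say how the search decides what counts as contraband, and the cell ids and images are made up.

```python
import hashlib
import random


def digest(image: bytes) -> str:
    return hashlib.sha256(image).hexdigest()


def random_cell_search(cells, known_good, rng, sample_size):
    """Search a random sample of cells; return ids whose memory no longer matches."""
    picked = rng.sample(sorted(cells), sample_size)
    return [cid for cid in picked if digest(cells[cid]) != known_good[cid]]


# Hypothetical cell memory images and their build-time digests.
cells = {"A": b"filter-a", "B": b"filter-b", "C": b"filter-c"}
known_good = {cid: digest(image) for cid, image in cells.items()}

cells["B"] = b"filter-b+malware"  # simulate a modified cell
flagged = random_cell_search(cells, known_good, random.Random(0), sample_size=3)
```

With a sample smaller than the number of cells, a modified cell is only caught when it happens to be picked, which is the "Monte Carlo" trade-off: cheap per search, with detection becoming likely over repeated searches.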
So there is no guarantee low level malware cannot get in briefly, but it would have to be something quite extraordinary to do much other than be detected quickly.
"A typical platform is millions of lines of code"
It depends. If you have a monolithic kernel like Linux, it currently has some 12 million lines of code that go into it.
If you look at the Minix 3 kernel it has around 4 thousand lines of code.
If you look at the bottom of the C Standard library there are only a very small handful of functions that need kernel support.
Going to an I/O Buffer model strips that down even further; think of it conceptually like that of the bottom USB comms layer (which I'm seriously thinking of doing in silicon) in some respects.
In other respects it is IPC by jumping windows provided by the simple VM MMU controller on a block of external memory (which is like software passing by reference rather than passing by value, and can have similar issues).
"which probably correspond to hundreds of thousands of functions,"
Err no, actually about 200 are all that are required and only a handful get regularly used.
"each requiring a computer/hypervisor pair."
Only whilst they are in active use. Remember that like any tasking computer you page out and task switch to make optimal use of resources.
When you statically link the functions a lot of the code bloat and security issues vanish. And the function code becomes very small.
Think of the difference between the Unix command line and stream editors "ed" and "sed", and a WYSIWYG editor like MS Word. Then look at a point or two in between such as vi or jstar and web based editors such as the one that appeared in the likes of GMail. There are many ways of doing the same things and each has its advantages and disadvantages. But with a little ingenuity you can do what vi originally did, which was to build on a much simpler primitive or function (ed).
"Even the simplest CISC cores take up a significant chunk of any affordable FPGA."
Yup, and CISC is most definitely not the way to go; it's an evolutionary dead end as Intel have discovered.
Some of us are old enough to remember what ARM originally stood for (Acorn RISC Machine). And also building CPUs out of early Texas DSP chips to support "Forth" in a significant way, which actually only needs 27 base instructions to work well, which kind of makes even most RISC CPUs look bloated.
You do however have to add a few other hardware instructions for maths and buffers but it's an easy tradeoff scenario in most cases.
[Forth, for those that are not familiar with it, is a "stack based" RPN language which has been known to beat optimised C code hands down. The RPN and stacks make analysing it very easy and light work.]
Just to show how crazy it can be, somebody developed a LISP system on a Forth engine; the performance results were at the time eyebrow raising to say the least.
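For anyone who has not met a stack based RPN language, a toy evaluator shows why it is so easy to analyse: every word either pushes a value or consumes and replaces the top of the stack. This is a sketch in Python, not Forth itself, and it covers only a handful of words chosen for the example.

```python
def rpn(source: str):
    """Evaluate a tiny Forth-like RPN expression and return the final stack."""
    stack = []
    binary_ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
    }
    for word in source.split():
        if word in binary_ops:
            b = stack.pop()          # top of stack is the right-hand operand
            a = stack.pop()
            stack.append(binary_ops[word](a, b))
        elif word == "dup":          # duplicate the top item
            stack.append(stack[-1])
        elif word == "swap":         # exchange the top two items
            stack[-1], stack[-2] = stack[-2], stack[-1]
        else:
            stack.append(int(word))  # anything else is treated as a number
    return stack


print(rpn("3 4 + 2 *"))    # [14]
print(rpn("5 dup *"))      # [25]
print(rpn("10 3 swap -"))  # [-7]
```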
"Custom designing functions & acknowledging they will probably be multiplexed to reduce compute node requirements leads me to believe the system still needs almost a hundred thousand FPGA's or tens of thousands of high end FPGA's."
Err no the functions are software images just like any normal program. The Kernel code is very minimal and modular and the C-CPU very "light weight RISC" in most cases (and not "pipe-lined" which causes all sorts of efficiency issues).
Function code is compiled into static "micro-images" (as are code modules in embedded systems) and is loaded very rapidly into the C-CPU memory space through the H-CPU.
[There are ways where you can bank switch the memory and registers to the C-CPU so "context switching" is just two or three clock cycles, such are the joys of stack based designs.]
IPC / data comms is currently done by memory mapped buffers into a large shared external memory space via a simple VM MMU controller. Thus the results in the output buffer of one "function" become the input buffer of the next function.
Importantly from the security context neither the H-CPU nor the C-CPU has access to the control side of the VM MMU. Thus they have absolutely no knowledge of the external memory, just a view of one or more buffers in a fixed place in their memory map.
[This is done by the externally controlled simple VM MMU controller: the C-CPU simply changes a status bit to say it's finished with a buffer and halts until the external system updates the simple VM MMU controller to the new external memory window and clears the status bit. This can usually be done as a page counter with address offset so is very very fast (just a couple of clock cycles).]
Imagine the IPC as a circular buffer of buffers with multiple functions using the one circular buffer of buffers in "lockstep".
Thus imagine the functions lined up (A,B,C,D,E,F), each with one "in" and one "out" buffer. B's out buffer will be switched to become C's in buffer; likewise C's out buffer will be switched to become D's in buffer. Thus C will stall/block if B has not finished writing to its out buffer, preventing it being switched to become C's next in buffer.
However A's in buffer gets fed from a "file" and F's out gets fed to a "file". These "files" are again just buffers to either large memory buffers for HD files or heads of streams.
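The lockstep buffer switching can be simulated in a few lines. This is a software sketch under assumptions: run_lockstep and the three example stage functions are made up, and buffer "switching" is modelled by reassigning list entries, not by an MMU window change.

```python
from typing import Callable, List, Optional


def run_lockstep(stages: List[Callable[[bytes], bytes]],
                 source: List[bytes]) -> List[bytes]:
    """Run blocks from the source "file" through the stages in lockstep.

    Each stage has one in buffer; a stage with an empty in buffer stalls.
    Returns the blocks written to the sink "file".
    """
    in_bufs: List[Optional[bytes]] = [None] * len(stages)
    pending = list(source)
    sink: List[bytes] = []
    while pending or any(buf is not None for buf in in_bufs):
        # Every stage with a filled in buffer runs; the others stall this step.
        out_bufs = [stage(buf) if buf is not None else None
                    for stage, buf in zip(stages, in_bufs)]
        # The last stage's out buffer is fed to the sink "file".
        if out_bufs[-1] is not None:
            sink.append(out_bufs[-1])
        # Switch: each out buffer becomes the next stage's in buffer,
        # and the first stage is fed from the source "file".
        in_bufs = [pending.pop(0) if pending else None] + out_bufs[:-1]
    return sink


stages = [bytes.upper, lambda b: b[::-1], lambda b: b + b"!"]
output = run_lockstep(stages, [b"ab", b"cd"])
```

Note how the later stages simply do nothing for the first couple of steps: they are stalled until the stage before them has produced a buffer to switch in.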
"Even assuming one could do a lot with an Intel-type wafer"
Ahh my mistake I was using Intel chips mainly as a "silicon area/size" comparison, not a direct function comparison.
That is to give an indicator of the gate count, pin count, internal memory potential and cost that can be achieved. Thus allowing a mental comparison to how many of these little beasts can be put in a package (and might well have done if the IBM PC team had picked the Motorola 68K instead of the Intel 8086).
I guess I should have looked for a high end rendering GPU chip that would be closer, but in most cases their internal structure and silicon area is proprietary knowledge so it's not easy to use as a comparator for what's possible.
"Intel's manufacturing processes are far ahead of the curve compared to FPGA's, proprietary, and unlikely to be released to give them competition."
Yes, but unlike GPU manufacturers they give away a lot of data in terms of gate density, structures, chip area, speed, throughput etc etc in marketing blurb. Which is why it can be used as a fairly reliable indicator of what is possible.
I'm using FPGAs simply for prototyping hardware, in the same way I used to use reprogrammable ROM as state machines and in one case an RF IF (but that's another story altogether).
If you have a look around you will find you can get PCI cards with multiple FPGA's on which means you don't have to even think about picking up a soldering iron 8)
"They're too invested in... ...Looking at... ...NVIDIA parallel chips, I don't see anyone who can sqeeze enough multiplexed functions on a chip to make this design cost-effective."
Look at what Sun did with their RISC CPU and Cray designed multiplexing (Starfire systems such as the 10K), likewise what IBM do in their Z machines.
They are still (maybe in Sun's case) doing it and producing throughput Intel still has wet dreams about. But Intel cannot achieve it because they have gone too far down an evolutionary cul-de-sac of single chip CISC. Then to get performance Intel have had to add all the pipelining, lookahead and hyperthreading etc etc, which chews up a lot of silicon for very little benefit.
Intel chips are designed to "appear to be all things to all systems". But in reality they do a limited set of some things very well for "office desktops". And thus perform badly in high end single user systems and low end servers and just cannot hack it in high end servers (look at context switching times and the underlying reasons).
But the Intel chips are extraordinarily cheap for "office desktops" because the overly expensive designs are amortised across multiple millions of chips.
The fact that they do badly in high end high integration servers is the reason we have gone to clustering. However even here people are looking at using games consoles and graphics cards because the GPUs have way way more performance for properly parallel tasks, not just in silicon area but in bang/watt and bang/chip cost.
Go have a look at some early ARM processors and have a look at how Unix did the IPC plumbing etc and have a think on how a simple VM MMU controller can be added on to map "buffers" for doing all the I/O above the hardware layer.
What I have done is move the tipping point by taking a new look from the bottom up. Instead of using desktop motherboards for making clusters I've used FPGAs on PCI cards.
I'm also currently playing (albeit very slowly) with PC104 cards to see if I can migrate up the COTS stack to get similar effects for considerably less cost. There is a reason for this and it is to do with "multi media": some functionality will not go into such limited resources as FPGAs. And like it or not high integration PC motherboards do make cheap thin clients for user I/O work.
Schneier.com is a personal website. Opinions expressed are not necessarily those of BT.