Comments

Beta February 17, 2010 7:55 AM

And we’re one step closer to a real malware ecology.

In this case the behavior has an obvious human motive: information I’m stealing is more valuable if I’m the only one stealing it. But it would be a good tactic for a pure software parasite too: a virus is more likely to survive if it goes undetected, and that’s difficult if a clumsy rival is alerting the user to trouble.

The topic of “benevolent worms” has come up before. Mr. Schneier has said that it’s a bad idea to create them, I’ve said it’ll happen sooner or later.

Zith February 17, 2010 8:00 AM

@Beta: Your second reason was actually the first that came to mind for me as to why someone would take this approach. And the rival doesn’t even need to be clumsy: users will eventually notice the lost resources if there are too many competitors.

derob February 17, 2010 8:35 AM

Maybe there comes a time when two bots are able to procreate, creating totally new bots… that would be life!

A Nonny Bunny February 17, 2010 8:38 AM

@Beta
“The topic of “benevolent worms” has come up before. Mr. Schneier has said that it’s a bad idea to create them, I’ve said it’ll happen sooner or later.”

It already has happened. For example, http://en.wikipedia.org/wiki/Welchia in 2003
And it was a bad idea, because it generated a lot of traffic while spreading and downloading patches to fix the security of the computers it infected.

Angel One February 17, 2010 8:42 AM

This isn’t really imitating the natural world as much as it is imitating traditional street gang behavior. (If you read the interview with the Nigerian scammer a few posts back you’ll notice that his “organization” is set up almost the same way as a traditional street gang too).

User Agent February 17, 2010 8:57 AM

I noticed that today the blog is blocking requests without the optional user-agent header. Is this planned to be a permanent change?

Winter February 17, 2010 8:59 AM

This is technology imitating nature.

Most bacteriocides come from germs trying to kill off rival germs. And viruses trying to block infections by other viruses are also well known. As are viruses parasiting other viruses.

Some of the most potent poisons are secreted by bacteria trying to spoil dead meat for scavengers. These are the ones causing food poisoning.

If you create an feeding ground, an ecosystem will develop.

Winter

User Agent February 17, 2010 9:01 AM

It looks like it was really something else. I had gotten a 403 back, changed the user-agent setting and things worked, so I assumed that was the problem. But when I switched back things were working again.
So I don’t know why I got the 403.

Clive Robinson February 17, 2010 9:24 AM

@ Moderator,

Are you having “growing pains”? I’m getting the Apache “forbidden” error on a daily basis (https pages) at US breakfast / lunch / tea times.

When I used to run a server I had similar problems and found they are sometimes caused by resource contention under load (sometimes just by two hits at the same time).

Clive Robinson February 17, 2010 9:36 AM

So the botnets have stepped it up a notch.

However they still appear to be using vulnerable, non-covert control and upload channels, so they are fairly readily spotted.

Oh, a modified version of Zeus had a go at .mil and .gov a short while ago with a payload designed to grab PDFs and other docs…

The thing is, it was successful against some computers even though they had AV etc. software on them.

I think something like one third of commercial AV software missed it even though it is fairly well known.

The simple fact is that a zero-day botnet with a rootkit, an uploadable payload and covert comms channels is going to be an ideal intel-gathering tool.

And saying it came from Russia without further evidence is about as useful as saying I bank with MegaBank (substitute your local major bank).

That is, Russia is a relatively safe place for cyber-ne’er-do-wells to appear to operate from.

TS February 17, 2010 11:29 AM

@Clive Robinson

Maybe it’s a rival security blog trying to take out the competition, we readers only have a limited amount of time to read blogs…

Brandioch Conner February 17, 2010 11:43 AM

Fascinating.

And yet so many people have claimed that you cannot automatically remove a rootkit on a remote machine.

Alan Kaminsky February 17, 2010 12:50 PM

Maybe all this malware will achieve what Apple, GNU, Linux, and everything else in the world have never been able to pull off: the demise of Windows and Microsoft. It’s gotten to the point where no sane person would even consider running Windows.

Craig February 17, 2010 2:14 PM

No panic yet; as stated earlier, the motive here is human and financial. The time to worry is once the Trojan has more control over its choice of which rivals it kills.

Champs February 17, 2010 2:53 PM

I’m with wiredog, this is dangerously close to the premise of Terminator 3. Let’s just hope that Zeus isn’t a false flag.

Clive Robinson February 17, 2010 3:57 PM

@ Nick P,

“Check out Clampi, credential thief extraordinaire”

Sadly there is insufficient information, and I suspect on just one reading that the article contains some incorrect information, possibly by design, possibly by miscommunication with the researchers.

I will go have a hunt around 😉

That being said there are a number of ways botnets are spotted.

1, Down the machine and check the files against known good files.

2, Check the files whilst the machine is running.

3, Observe the internal behaviour of the computer.

4, Observe the external behaviour of the computer.

The first works simply because the code has nowhere to hide IFF you check all the mutable memory against a known good copy (however, this presupposes good code as a reference).
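
As a minimal sketch of that first approach (the mount path is hypothetical, and a SHA-256 manifest of the known-good files is assumed to have been recorded beforehand):

```python
import hashlib, os

IMAGE_ROOT = "/mnt/suspect"           # suspect disk mounted read-only (hypothetical path)
MANIFEST = "known_good_manifest.txt"  # lines of "<sha256>  <relative path>" recorded earlier

def sha256(path):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

# Load the known-good reference
good = {}
with open(MANIFEST) as f:
    for line in f:
        digest, rel = line.rstrip("\n").split("  ", 1)
        good[rel] = digest

# Walk the offline image and report anything added or modified
seen = set()
for root, _, files in os.walk(IMAGE_ROOT):
    for name in files:
        full = os.path.join(root, name)
        rel = os.path.relpath(full, IMAGE_ROOT)
        seen.add(rel)
        if rel not in good:
            print("ADDED   ", rel)
        elif sha256(full) != good[rel]:
            print("MODIFIED", rel)

for rel in set(good) - seen:
    print("MISSING ", rel)
```

Because the suspect machine is down and the check runs elsewhere, there is nothing for the malware to lie to.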

The second runs the same way as the first; however, it has the additional assumption that you have a non-rooted monitor that cannot be lied to by the rest of the machine (as we know with rootkits, this can be a failing with many kernels).

It is the way a number of malware prevention systems work; however, many are not particularly good, as they only recognise “known” malware (and often not very effectively).

The third method observes the computer’s workings with regard to memory access etc. Currently there is not as much information out there as needed to say if this will reach down into the hardware as well. Imagine it as DPA-type activity.

By and large most computers have fairly regular signatures where a change can be fairly easily spotted. However it requires the same if not more horsepower than the machine it’s observing.

The fourth method is where it’s currently at for many researchers. Basically you have a “honey pot” machine whose inputs and outputs you observe carefully. Changes in behaviour trigger an alarm and type 1 or type 3 investigations follow.

However, as I have noted in the past, there are ways to enumerate a lot of “honey pot” systems simply because they are virtual and things like timestamps stay in “lockstep” across many IP addresses, simply because it is actually just one machine using virtualisation software to pretend to be many.

Timestamp enumeration, for instance, can be made to appear to be a very ham-fisted script-kiddy attack. The honeypot owner ignores it as just an unsuccessful script kiddy. The covert enumerator however knows from examining the timestamps from the “kiddy scan” that it’s a “honeypot” and marks the IP address range as bad news and leaves it alone…

The real trick for the covert botnet, as I have said for a while, is to hide the control and upload comms in what would be seen as normal user activity (such as a Google search). There are ways by which you can use Google to completely break the need for conventional channels…

It will be interesting to see what happens in Botnet development over the next year or so…

However I suspect we will only see the bad not the good unless we get type 1 and type 3 searches working properly.

And it’s not going to happen with consumer-grade OSes and hardware; they are designed to be “efficient”, and that is often incompatible with security due to various side channels.

concerned February 17, 2010 5:45 PM

@ Alan Kaminsky

Windows is attacked because it has market share, not because it is particularly vulnerable.

Success in attacking 100% of Macs would be similar to having 6% success on Windows. ROI for attacking Windows is much higher.

As Windows gets better, look for the action to move to Flash and PDF.

Clive Robinson February 17, 2010 5:50 PM

@ Brandioch Conner,

“And yet so many people have claimed that you cannot automatically remove a rootkit on a remote machine.”

It depends on if you mean a rootkit as a “specific known instance” or a “general class of unknown” software.

In this case it is a “specific known instance” so yes it can be removed because you don’t have to “search for it” (though care would be needed to not crash the system and that is probably the hard part).

In the case of a “general unknown” you have to go search, and part of the rootkit class definition is that it “hides its existence” (that is, it prevents detection by providing false results to searches etc.).

All rootkits can be found and removed if you do not allow the rootkit to start and thereby hide itself (the problem is knowing you’ve checked “it all”, and that is another issue altogether).

Most rootkits can be found even when running if you can “search underneath” the point at which the rootkit lies to you.

Simplistically it’s a bit like phoning a person and their secretary answers and says “they’re in a meeting”; unless you can check between the secretary and the person, you are going to have to accept what the secretary says.

If however the person has a window and you can look into their office through it and see them sitting at their desk, then you have got between the person and the secretary, and thus know the secretary is lying…

The other way of course is to ask questions where the secretary/rootkit is caught out in a deception because their answers are inconsistent. Doing this to find a rootkit can be done, but it’s a lot of hard work and relies on your ability to find a question that shows the deception (say missing inodes on a Unix file system, or inconsistent used/unused cluster info and file sizes).

It is of course possible to design a “deadman’s switch” rootkit. That is, it encrypts the user files. If you remove or tamper with the rootkit, the encryption key is lost, and so are all the files so far encrypted. Thus sometimes it is not desirable to remove a rootkit until you have got all the user files off of the machine and tested them on an entirely different machine.

Rob February 18, 2010 2:28 AM

@Clive Robinson:
“2, Check the files whilst the machine is running.”

That can be done, even on an utterly rooted machine, provided you had access to it when it was in a good state. You know, not only is security hard; protecting malware kits is too 😉

Make a hash of password + the system. Then ask for the hash with a set of random passwords. The machine should come up with OK only for the correct password.

It is easy to make a set of hashes, each time with a different password, and store them with random correct/incorrect passwords. At testing time, you should get back a list of PASS/NO PASS, {0, 1}, lines, which make up a binary number which you can remember or write down off-line (e.g. 137). The only way to defeat this scheme is to store all tested resources and present them when asked.
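
A minimal sketch of the idea (the file choices and challenge counts are made up; the real implementation is the signduterre tool linked below):

```python
import hashlib, secrets

FILES = ["/bin/ls", "/bin/ps"]   # resources to sign; illustrative choices

def system_hash(password, paths):
    """Hash of password + the listed system files (this part runs on the machine)."""
    h = hashlib.sha256(password)
    for p in paths:
        with open(p, "rb") as f:
            h.update(f.read())
    return h.hexdigest()

# --- Preparation, while the machine is still known-good -------------------
# Build (password, stored_hash) pairs.  For some entries the stored hash is
# the real one (should PASS later), for others it is garbage (should NOT
# pass).  Only the off-line notebook records which is which, as a bit string.
challenges, expected_bits = [], ""
for _ in range(8):
    pw = secrets.token_bytes(16)
    if secrets.randbelow(2):
        challenges.append((pw, system_hash(pw, FILES)))
        expected_bits += "1"
    else:
        challenges.append((pw, secrets.token_hex(32)))   # deliberately wrong
        expected_bits += "0"
print("write this down off-line:", expected_bits)

# --- Test time, on the possibly compromised machine -----------------------
observed_bits = "".join(
    "1" if system_hash(pw, FILES) == stored else "0"
    for pw, stored in challenges
)
print("matches the notebook?", observed_bits == expected_bits)
```

A machine that simply answers PASS to everything gets the deliberately wrong entries “right”, which is exactly what gives it away.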

I tried this idea out with a Python program, just a proof of concept (it is at SourceForge and repo.or.cz).

http://sourceforge.net/projects/signduterre/
http://repo.or.cz/w/signduterre.git

Note that it can be broken if the malware is able to run a simulated system. That is, if it can dish up a copy of every file or service you try to “sign”.

Rob

kangaroo February 18, 2010 8:42 AM

@Clive: All rootkits can be found and removed if you do not allow the rootkit to start and thereby hide itself (the problem is knowing you’ve checked “it all”, and that is another issue altogether).

Not true for the general case. Most would be correct — but there can be no universal “rootkit” signature. Universally, you can’t identify anything complex in a complex system — you can’t know the effects of anything for the most general case (without iterating over all possible states).

But practically, of course, that’s usually irrelevant, since you can get almost all cases.

Clive Robinson February 18, 2010 9:49 AM

@ kangaroo,

“– but there can be no universal “rootkit” signature.”

True,

“Universally, you can’t identify anything complex in a complex system –”

Not quite true, you forgot the word “unknown” prior to either complex or system.

If you know what should be in the mutable memory and why then you can remove anything that should not be there or is not required for how you intend to operate the system.

Even in an unknown system, forensic “mutable memory structure investigation” (now given various fancy names) has a reasonably good chance of identifying when and how changes were made.

That is there is a level below which the root kit cannot get and still be able to remain hidden from examination.

For instance, on a simple hard drive (FAT16 format for example), if you had defragged it prior to the rootkit then the position of data within the cylinders, tracks, clusters, sectors and slack space would be effectively linear and should align with the File Allocation Table. The rootkit would either create a non-linear disruption or be “tacked on the end”.

This can enable additions and modifications to be found reasonably easily.
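
A sketch of that linearity check; it assumes some separate step has already parsed the FAT and produced each file’s cluster chain (the sample data is made up):

```python
def is_linear(chain):
    """A file on a freshly defragged volume should occupy consecutive clusters."""
    return all(b == a + 1 for a, b in zip(chain, chain[1:]))

def report_disruptions(cluster_chains):
    """cluster_chains: {filename: [cluster numbers]} as read out of the FAT by
    a separate parsing step (hypothetical input format)."""
    last_used = 1   # clusters 0 and 1 are reserved in FAT16
    for name, chain in sorted(cluster_chains.items(), key=lambda kv: kv[1][0]):
        if not is_linear(chain):
            print("fragmented after defrag -- suspicious:", name)
        if chain[0] > last_used + 1:
            print("gap before", name, "- something was inserted or deleted")
        last_used = max(last_used, chain[-1])

# Example with made-up data: C.BIN was written after the defrag pass.
chains = {"A.EXE": [2, 3, 4], "B.SYS": [5, 6], "C.BIN": [9, 10, 11]}
report_disruptions(chains)
```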

I demonstrated this back in the early ’90s to a bunch of students looking for project work. However they found it “too dull” as a concept…

I can see their point: back then the effort involved in carrying out such an analysis was usually not worth it (and thus did not get paid for at a lucrative rate, if at all). However the new “beggar your opponent” legal technique of “electronic discovery” has put a slightly different shine on it these days (hence the new fancy names pseudo-linking it to geology etc 😉

Brandioch Conner February 18, 2010 10:58 AM

I wonder why those people who said that this couldn’t be done keep claiming that it is impossible when it is already being done “in the wild”.

Clive Robinson February 18, 2010 11:20 AM

@ Rob,

“Note that it can be broken if the malware is able to run a simulated system. That is, if it can dish up a copy of every file or service you try to “sign”.”

The thing about “simulated systems” is they are not perfect…

Thus things like response times can also be measured to see if the system is real or simulated.

Checking the timing is not perfect, but if the rootkit designer has not built it in you know that something is wrong.
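
A rough sketch of the timing idea; the baseline figure and tolerance are assumed values, and a real check would need many samples, cache warming and a calibrated threshold:

```python
import time, hashlib

TARGET = "/bin/ls"          # file whose read time we baseline (illustrative)
BASELINE_MS = 2.0           # measured earlier on the known-good system (assumed value)
TOLERANCE = 3.0             # allow 3x jitter before flagging

def timed_read(path):
    start = time.perf_counter()
    with open(path, "rb") as f:
        hashlib.sha256(f.read()).hexdigest()
    return (time.perf_counter() - start) * 1000.0

# Take several samples; an interposing layer (filter driver, emulator)
# tends to add latency it cannot fully hide.
samples = [timed_read(TARGET) for _ in range(20)]
median = sorted(samples)[len(samples) // 2]
if median > BASELINE_MS * TOLERANCE:
    print("response time %.2f ms is well above baseline -- something is wrong" % median)
else:
    print("timing within expected bounds (%.2f ms)" % median)
```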

A system such as this used to be used to stop CD-ROM games being copied to hard disk and run from there.

It worked fine until some people realised what was going on and built a simulator to make the HD response times look like those of the CD-ROM.

All simulated systems have Achilles heels; the question is whether they can be identified in a sufficiently reliable way.

From the rootkit writer’s point of view there is only so much they can simulate before the system response times become such that things don’t work the same way.

For instance, I’m aware of a piece of Unix malware (yes it does exist) getting found simply because it changed the system characteristics sufficiently that initd got its “knickers in a twist” 😉 A service that had previously started before another due to hardware latency and resource issues did not start in time because the malware was using resources, and due to poor script writing by a sysadmin this caused a process to fail. The subsequent investigation revealed the malware.

In the case of the Sony rootkit, it likewise got found out due to changing the behaviour of a machine. The first indication was a researcher noticing it did an ET (phoned home), and he subsequently found other effects on the CD and HD.

It is knowing this that allows signatures not just of the files but of the system activity, thus enabling changes to be detected. A very early example of this was with the ICL George operating system. It was a batch system, and the operators used to listen to some aspects of the system on a speaker (built into the Op Panel). The warbling used to alert them to changes that indicated bugs etc. There is a story about an AM radio at MIT being used for the same purpose.

kangaroo February 18, 2010 2:48 PM

@Clive: That is there is a level below which the root kit cannot get and still be able to remain hidden from examination.

No — there isn’t such a level “in general”. “In usual” there is.

Take your FAT16 example — for most cases (almost all), that’s true. But it’s not necessarily true — a super-genius rootkit could continuously defrag itself.

That’s the problem with general computational systems — they have states that are provably unpredictable, except by full iteration.

It’s kind of a pedantic point, I know, but I’m a pedant. When I say “all” I mean the mathematical operator.

So, even with a “known” system and a “known” rootkit, if the rootkit is capable of general operations, it could be there without you knowing it (extremely unlikely). With an unknown rootkit, it could be there at a less unlikely (but still very unlikely) probability. With both unknown, you start to reach the level of it not being terribly unlikely that you can not identify the rootkit.

It’s like proving that a system has no “bugs” essentially. For a general computer without some special assumptions, you can only do it by going through all states. A rootkit is essentially a bug.

Brandioch Conner February 18, 2010 3:35 PM

@kangaroo
“So, even with a “known” system and a “known” rootkit, if the rootkit is capable of general operations, it could be there without you knowing it (extremely unlikely).”

And, as demonstrated in the article that Bruce posted, this is sufficient for real world work.

Clive Robinson February 18, 2010 5:18 PM

@ kangaroo,

“It’s kind of a pedantic point, I know, but I’m a pedant. When I say “all” I mean the mathematical operator.”

This could prove interesting then 8)

One of the (more or less) accepted attributes of a rootkit is that it survives a full power cycle of the system.

IFF you agree with that, then you must have some part of the rootkit in a tangible form in semi-mutable memory on the system.

That is, when the system is powered up there must be information (code) retrieved from the semi-mutable memory WITHIN the system to activate the information that forms the rootkit (agreed?).

If not, then the machine does not have a rootkit on it at powerup (agreed?).

Thus the rootkit can be said to have a tangible presence on the system.

On a “known” (that is, known to be good) system this information in tangible form will be in addition to that which is “known”.

Because it is in tangible form it is using some of the system’s resources whilst powered down.

It also cannot be hiding itself when powered down.

Thus examination of all the semi-mutable memory on a powered-down “known” system will be able to find any additional or modified resources.

Thus an examination (by an immutable check agent) of the information in semi-mutable memory, as data and not code, will show where it has been changed and how.

Thus there is a level below which the rootkit cannot hide itself, as it has to be available (in part) to be seen by the CPU as code on powerup.

So a scan of the memory before powerup compared to a scan of the memory after it is powered up will show a difference if the rootkit is hiding itself.
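
A sketch of that comparison, assuming you can dump the same semi-mutable memory region both with the machine powered down (e.g. via a separate reader) and through the running system; the file names are hypothetical:

```python
import hashlib

def digest(path):
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

# offline_dump.bin : the region read with the system powered down (external reader)
# online_dump.bin  : the same region as reported by the running system
offline = digest("offline_dump.bin")
online = digest("online_dump.bin")

if offline != online:
    print("running system reports different contents than the raw image:")
    print("  offline:", offline)
    print("  online :", online)
    print("something is lying about what is in that memory")
else:
    print("images agree - no hiding detected at this level")
```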

Now I would agree there may not be much you can do with the information IFF a difference is discovered, but it is an indicator that something is unaccountably different.

At which point you may simply obliterate the contents of the memory concerned and re-load it with known good information.

I would however agree that you cannot ask the system itself to do the scan, as there is usually semi-mutable memory within a modern CPU that does survive a power cycle.

Which gives rise to some issues if the design of the CPU is such that its semi-mutable memory is not verifiable via its pins.

So on a “known” system there are ways by which a rootkit or other unauthorised change that has been made can be detected.

The problem is of course that the immutable resources required to do this put some interesting constraints on system design.

You will note that I’m avoiding the issue of the “known” system image actually containing a rootkit.

Thus if your installation media comes with a built-in rootkit, tough, you are owned, and there is I would agree little that you can do IFF the rootkit has been put in correctly.

However if we are going to discuss this in any real depth we do need to start nailing down meanings or we could easily argue at cross purposes.

Jay February 18, 2010 6:03 PM

@Clive

Sooner or later, inspecting all the ‘semi-mutable memory’ on the system will require desoldering the BIOS Flash.

I figure the only reason we haven’t seen BIOS rootkits is an effort/risk vs. reward thing – it’s not worth it yet to make your rootkit practically undetectable…

Clive Robinson February 18, 2010 7:14 PM

@ kangaroo,

I think for the sake of the Moderator’s blood pressure I should explain something.

We are both right and both wrong….

Which is why I said,

‘This could prove interesting then 8)’

We are actually talking about two separate things that look very similar.

You are saying that “inside a system” the system cannot prove it is not infected (Alan Turing’s halting problem and Kurt Gödel’s undecidability theorems).

I’m saying that by “being outside” the system you can detect unauthorised changes made to the system (by malware etc.) by simple image comparison.

As Brandioch and I have crossed foils over this in the past, it needs a little further explanation for others (especially as some may think I have changed my position).

There is a problem with Alan Turing’s little state machines that is not talked about very much.

They process “information” on a tape (memory). The information can be used as data or code or both.

The issue is there is no segregation between information that is data and information that is code.

From a security aspect the Universal Turing Machine is a foot with both barrels of a loaded shotgun with a hair trigger pressed very firmly up against it.

However the problem is a little deeper than that.

Broadly there are two types of computing engine.

1, Those that change data but do not react to data.

2, Those that change data and do react to data.

The difference is not immediately obvious but has significant implications for what can be done with data and security.

A Digital Signal Processor is an example of the first type of computing engine. It reads data in, performs a set of instructions on the data, such as multiply by a constant and add a previous value, and outputs the changed data.

The data itself is never looked at by the DSP program; thus it is not working as a Universal Turing Machine because its behaviour is not modified by the data, and therefore there are certain things it cannot do (branching on compare).

From a security aspect such a computing engine is highly desirable, as you can prove exactly what it will do under all circumstances.

Of the other type of computing engine there are again two basic types: those where instructions and data are on separate tapes (Harvard) and those where they are both on one tape (von Neumann).

From a security aspect the Harvard architecture is preferable to the von Neumann. However, for many reasons the von Neumann architecture is preferred for “general computation”.

From a security aspect the “Pure Harvard” architecture has the advantage that data can never be used as code. Although data can change the flow through the states of a program, the data cannot add new states, either deliberately or accidentally.

Thus with a “Pure Harvard” architecture any insecurity has to be “built in and exercised by the data”; the data cannot add insecurity.

From the security aspect the von Neumann architecture is Argh…

kangaroo February 18, 2010 7:41 PM

@Clive:

Yes, of course, you can use a machine that is not a general computing machine.

But, but, but for the non-DSP case — making code immutable doesn’t solve your problem. What you need to do is not only make code immutable — but remove any possible infinite loops, which requires as well making any branching decisions be to code that itself can’t make arbitrary branching decisions. Aka, no “interpreters” exist on the machine — which is not generally acceptable, since that eliminates most software programs, reducing all programs to the “DSP” class.

That is the little “Goedel” issue. Russell tried exactly to separate “data” from “code” to avoid infinite regress — and Goedel found that you can get around it, for the general case, no matter how hard you try.

Now, with a few assumptions, you can get around it — as you say, immutable code AND a hardware controlled offswitch (or a limited memory that is tractably iterated). Then, you can’t eliminate rootkits — but they can only survive temporarily and be predictably eliminated.

So, it depends on what your goals are — for a server, an offswitch is a bad assumption. For most general computers, immutable code is a bad assumption.

Rob February 19, 2010 1:15 AM

@Clive Robinson:
“The thing abot “simulated systems” is they are not perfect…”

My point completely.

The idea behind the signing is that “Security is hard”. Hard for the defender, but also hard for the attacker.

In most security situations, the defender must assume the attacker has more resources than “herself”. In this situation, it is the rootkit that has fewer resources than the legitimate defender.

The rest of the discussion is about the fact that a rootkit theoretically could simulate the whole computer, say, by running it in a virtual machine with appropriate hooks.

In a “normal” security analysis you would indeed investigate this probability, as the attacker might indeed have the hardware to do this.

But YOUR computer is all the attacker has. It has to fit in limited storage, with limited power. So there will ALWAYS be operations you can ask it to perform that it will not be able to complete.

So, just make sure you ask a lot of unpredictable “questions” using a wide range of system commands and programs. And look at many different places.

The seminal paper on this would be “Reflections on Trusting Trust.”
http://www.schneier.com/blog/archives/2006/01/countering_trus.html

Note that David A Wheeler has completed his PhD thesis on how to break the chain:
http://www.dwheeler.com/blog/2009/11/29/#trusting-trust-success

Rob

Clive Robinson February 19, 2010 6:01 AM

@ Rob,

“Note that David A Wheeler has completed his PhD thesis on how to break the chain:”

This I very much look forward to reading 8)

@ kangaroo,

“So, it depends on what your goals are”

Yes but as importantly how you define them, view them and most importantly implement them.

For instance you say,

“– for a server, an offswitch is a bad assumption. For most general computers, immutable code is a bad assumption.”

Which in a 20,000ft view I would agree with,

But these are end-point conditions, and in one case we know it doesn’t exist (a server is never truly “never off”).

Importantly, recognition of this has enabled us to get closer to the desired state (server never off) by using other tricks and trade-offs, and thus doing an “end run” around the issue, usually by making things worse…

In the case of “server never off” it is a trade on Mean Time To Fail (MTTF) and Mean Time To Repair (MTTR), and putting parallel systems in place to make the trade work.

!!! Warning: long dull explanation !!!

For those that don’t know, military systems have high “availability” figures that usually cannot be met by single components.

What engineers did was change the design goals.

Achieving MTTF figures is a game of diminishing returns on a single component. That is, at some point on the curve the cost of designing a single component that is twice as reliable is more than twice the cost, and it gets exponentially worse thereafter.

The solution is to use two or more components in parallel; thus if one fails you suffer a decrease in performance until it is replaced. You then design the system to have a performance that works within specification with that decrease.

You then design the components in the system to be easily replaceable, thus vastly reducing the MTTR. The problem with doing this is it also brings the components’ MTTF down, which appears counterintuitive.

But the decrease in MTTF matters considerably less than the reduction in MTTR. Thus there appears to always be a “sweet spot” at which a trade can be made for the number of components you put in parallel.

So you can make things better by making them worse 8)

Eventually you design a system to automatically detect and swap over failed components (failover), the spare components being kept in “hot standby”. You then design the system such that components can be “hot swapped”, that is removed whilst still in operation, for “Planned Maintenance” etc.
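
A small worked example of that trade, with assumed MTTF/MTTR figures in hours and independent failures:

```python
# Availability of one component: A = MTTF / (MTTF + MTTR)
def availability(mttf, mttr):
    return mttf / (mttf + mttr)

# Probability that at least one of n identical parallel components is up,
# assuming independent failures.
def parallel_availability(a, n):
    return 1 - (1 - a) ** n

# A "worse" component (lower MTTF) that is quick to swap can beat a "better"
# one that takes ages to repair, once you run a couple in parallel.
single_good = availability(mttf=10000, mttr=72)   # reliable but slow to repair
single_poor = availability(mttf=4000, mttr=2)     # less reliable but hot-swappable
print("good but slow to fix :", round(single_good, 5))
print("poor but quick to fix:", round(single_poor, 5))
print("two poor in parallel :", round(parallel_availability(single_poor, 2), 7))
```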

However the curves of component lifetimes are complex (the so-called bathtub curve) and this makes the problem somewhat more complex.

!!! End of long Dull…

But getting back to your points,

“But, but, but for the non-DSP case — making code immutable doesn’t solve your problem. What you need to do is not only make code immutable — but remove any possible infinite loops”

Sorry, I forgot to mention loops this time around, but as Nick P knows I’ve mentioned them before, along with both positive and negative, feedforward and feedback (not the best terms to use but they suffice).

Yes, “infinite loops” in changes of state have to be removed and replaced by “chains” of states, and of course “halting” has to be enforced within a state as well.

Which does put a limit on what you can do.

However I need to be careful with how I say this point 😉 you don’t have to “remove” data-dependent branching or looping or non-halting from your system for security.

What you have to do is “prevent the detrimental effects” of “state looping” and “state halting” for security.

That is, you are doing what humans do best, taking an “end run” around the problem (it’s how Monte Carlo methods started at Los Alamos).

Which brings me back to,

“Aka, no “interpreters” exist on the machine — which is not generally acceptable, since that eliminates most software programs, reducing all programs to the “DSP” class.”

First off this is not always a problem for security as we have both noted.

But secondly, whilst True (in the mathematical sense) for a self-contained system, it’s not of necessity true of a system with a monitoring agent IFF you are prepared to accept a loss of functionality in return for control.

And this is where it gets very messy (think of it as the start of “Monte Carlo method” of security).

We know that in the mathematical sense we have issues with both Turing’s and Goedel’s problems on “general computing engines”.

As you put it,

“That is the little “Goedel” issue. Russell tried exactly to separate “data” from “code” to avoid infinite regress — and Goedel found that you can get around it, for the general case, no matter how hard you try.”

The point is that what Goedel, Russell and Turing were trying to do does not directly apply to security.

We know in very real terms that security can never be absolute. However the universe is finite, thus we can put security beyond that finite bound in various ways. That is, we perform an end run around one particular aspect (brute force) of the security problem.

Kurt Goedel’s theorems show that ultimately we cannot determine some things. But do we need to determine that an action is actually insecure?

Now we can say on “probability” that it is, and therefore take action on the assumption it is insecure. Thus we effectively step back from Goedel’s theorems by “erring on the side of caution”.

To do this we have our general purpose computing platform doing its thing.

However we control what goes in and what comes out at the interfaces.

And we can design an immutable system with branches to do this, provided the branches always result in the “stop” state. We can also allow it to loop indefinitely provided we check it is still moving from state to state in the correct sequence.

However, as you will point out, this monitoring suffers from the “lesser fleas” problem of infinite regression, which whilst true is one of those issues you can end run around with “Stop”.

It is not perfect but as noted by Rob above it’s our system thus,

‘But YOUR computer is all the attacker has. It has to fit in limited storage, with limited power. So there will ALWAYS be operations you can ask it to perform that it will not be able to complete.

So, just make sure you ask a lot of unpredictable “questions” using a wide range of system commands and programs. And look at many different places.’

Hence welcome to the world of “probabilistic security” and say goodbye to the world of “deterministic security”.

(Hence my Monte Carlo comment).

But it actually gets better: do you remember my comments about MTTF and MTTR?

Well the same applies,

If you put a lot of little systems in parallel, they can each do a piece of your task. The attacker however has to do his complete task in every one of those little systems, and at some point he cannot do it…

That is, you starve the attacker of the resources to do what they want whilst still accomplishing what you want.

Now if you’ll excuse me for a while, I have a hefty paper to download and read.

@ Brandioch Conner,

I hope this goes some way to explaining why a single von Neumann system will always be vulnerable, but a number of limited systems in parallel with an appropriate watchdog may not be.

Nick P February 23, 2010 3:03 AM

How did this discussion take off without me? Well, I’m reading this under the positive[-feeling] influence of vodka and I’m thinking: OMG! way too much abstract stuff! It reminds me of Clive’s and my previous discussion on a high assurance hard drive encryption scheme, except much less concrete and hence less productive. Let me try to tackle a little bit of these issues from a different, more concrete perspective.

For rootkits, we are looking at three things: prevention, detection, removal. A lot of talk here centers around detection. There are many methods, but most compare a known correct state to an incorrect state to determine all is not well. They then assume (or argue for) the existence of a malicious rootkit. I say if you’re at the point of requiring many detection methods and removal is a chore, you’ve already failed at the design stage. So, let’s look at prevention as it relates to what people are saying.

@ kangaroo

Oh sure, we can look at things pedantically and use pure mathematical operators, but this isn’t the world of mathematics. That’s just a deterministic abstraction of a chaotic, messy reality. Some theorems say I can’t predict general purpose systems without proving all their states, but if I control the inputs I beat that entirely. Likewise, I can limit it just enough that it’s general purpose enough, but not “all” in the mathematical sense, to do useful work that can be verified formally. Formal verification has pitfalls, as Guttman demonstrated amply. However, it’s been extremely useful when its weaknesses were considered and systems were developed in a way that suits analysis. The point, you may ask? F*** math and its abstract rules, theorems, etc. It’s a tool, not a concrete reality. We should use it where it’s practical and work around it otherwise. We have repeatedly in the past and we shall in the future.

@ Clive

Well, I think that the von Neumann architecture is inferior but can be secure. It requires trusted boot and trust in these components: TCB hardware; firmware; bootloader; kernel-mode stuff; drivers (usually); platform support services in the TCB. There are systems that demonstrate this for the most part: Honeywell’s SCOMP; GEMSOS on x86; INTEGRITY-178B on Rockwell Collins’ verified AAMP7 processor. If you have a trusted boot process and correctly verify it (e.g. formally with enough granularity), then you have a clean start. If the TCB prevents rootkit infection, then rootkits are no longer an issue at the kernel or OS level. This means kernel- or OS-level apps can be used to monitor for violations in user-mode and non-TCB apps. Yeah, this is the detection stuff I said is a fail, but it’s a failure in the lesser stuff. At this point, we at least have the [inefficient] method of destroying a rogue process and restoring from a clean state. Can’t do that in traditional OSes with insecure TCBs.

So, we have a clean start. What next? Well, I like the virtualization concept. We create a trustworthy emulation layer that isolates untrusted legacy OS and app code. OK Linux on OKL4 and Green Hills’ INTEGRITY Padded Cell are the best examples I’ve seen. They start with a [semi-]trustworthy platform and highly verified microkernel OS, then they run guest OSes and apps on a virtualization layer. The virtualization layer is in user-mode, as are drivers, and both are totally subject to kernel-level security policies. This has many implications for rootkits. They are easier to detect, isolate, purge, and prevent. One thing academics haven’t explored is that this makes it easier to do the security-by-diversity concept. You may have several different components redundantly doing a security-critical activity, with a trusted voting component producing the final result. The strong TCB almost guarantees that the compromise of one won’t hurt the rest and the overall system is still safe.

Random Stuff Follows

The main problem with the von Neumann and Harvard architectures is that there is no central security enforcement mechanism for even local system activity. How? >>>DMA<<<. This performance enhancer bypasses the CPU that holds the security kernel and defeats the reference monitor concept totally. A high performance version of PIO would be nice.

SecVisor is a nice little recent tool that sits in the root mode of an Intel VT-capable processor and monitors the kernel. The tool is TINY, formally verified, and currently detects/fixes changes in Linux kernel. It would be a start on producing a trustworthy kernel-level rootkit prevention/monitoring tool that itself couldn’t be subverted.

I think microkernels are going to be a big problem solver here. Microkernels let us analyze things in an isolated and connected way. We can build systems bottom up, easily restrict information flows to what’s necessary, and replace faulty/subverted components on the fly. QNX Neutrino already does this to a degree with its automated restart of critical kernel services to achieve high uptime. I’d like to see a similar mechanism for integrity on a high assurance microkernel like, err, INTEGRITY. High assurance microkernel plus medium assurance virtualization layer plus decent OS plus isolated security-critical code = good enough 99% of the time. Some say, “What about the other 1%?” Hell, let’s try to get the first 99%, first, then worry about a 1% market. 😉

Brandioch Conner February 23, 2010 11:56 AM

The most amusing thing is how many people don’t have the faintest idea of what they’re talking about with regards to mathematics.

Mathematics can be used to model Reality. But mathematics cannot define Reality.

If Reality contradicts the model you have, then the model is incorrect. And no matter how much mathematical jargon you throw at it, you will not change that fact.

Clive Robinson February 23, 2010 12:43 PM

@ Nick P,

“How did this discussion take off without me?”

Well, it could have something to do with other more popular posts pushing it out of the 100 new comments list in less than 24 hours (which is why I sometimes miss replies to my comments).

“Well, I’m reading this under the positive[-feeling] influence of vodka and I’m thinking: OMG! way too much abstract stuff!”

Yup but some people will look at your post and think,

‘OMG! way too much technical stuff!’

Just kidding 8)

Kangaroo is correct in making the point that theoretically a single general purpose system cannot do what we want it to do, due as we have said to the little problems of Turing and Goedel.

But as I also said, we humans not being constrained by such mathematical proofs do find end runs around them.

And my viewpoint is that yes, a single general purpose system is going to be vulnerable, and no, there is nothing we can do to stop it happening by working within that system.

That is, we need to step outside the system and look at its operating signatures with another system.

The question then arises: does this second system have to be,

1, General purpose.
2, Mutable in operation.
3, Accessible from the first system it is monitoring.

And the answer is definitely NO to the first two, and should be no for the third. Depending on your viewpoint, the second system can see the data on the first, and if the second system allows it, this could adversely affect the second system’s behaviour.

Which then brings forth two more questions.

The first being: is this second system still subject to the issues arising from Turing and Goedel’s little problems? The answer is probably not.

The second is what effect does this have on the first system.

Well to answer that you would have to look at how it works.

In a simplistic case it would halt the first system and check every piece of mutable memory held the correct values (how you know what they are is not of relevance at the moment) and therefore the first system is still “correctly known”.

Thus the second system time (SST) would take a time slice out of the first system time (FST) of operation, so you could have a first system availability (FSA) calculation of,

FSA = FST / (FST + SST)

However this availability figure is granular in nature. Say SST was 0.1% of the total over 24 hours, or ~86 seconds. This begs the question of what happens in the other 23 hours 58 minutes.

Thus you also need to work out a minimum window time (MWT) between the second system’s monitoring runs. Obviously the MWT should be sufficiently small as to reduce the effectiveness of any untoward activity on the first system.
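
A small worked example using the figures above (the 86-second daily checking budget is the assumed 0.1%):

```python
DAY = 24 * 60 * 60          # seconds in a day
SST = 86.4                  # total second-system (checking) time per day: 0.1%
FST = DAY - SST             # time left for the first system to do real work

fsa = FST / (FST + SST)
print("first system availability: %.4f" % fsa)   # ~0.999

# Spread the same budget over many short checks so the unmonitored
# window (MWT) shrinks from a full day to a few minutes.
for checks_per_day in (1, 96, 1440):
    mwt = DAY / checks_per_day
    per_check = SST / checks_per_day
    print("%5d checks/day -> window %7.1f s, %5.2f s of checking each"
          % (checks_per_day, mwt, per_check))
```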

Thus you start looking at the probability of harm to the first system in a different way.

The reason being that you cannot do a simple check as frequently as required to prevent an attack and still get useful work done.

The first thing that comes to most people’s minds is to do delta-type checks on the file system. That is, once it’s in a known state you keep an eye on what gets changed/accessed by the first system. This leads on to the notion of operating signatures etc.

Thus you have moved from actively getting into the first system to looking at what it is up to.

Which eventually gives rise to thinking about the relationship between the two systems and what constitutes a system.

In a single-system viewpoint the CPU is king of all within its perimeter; this is the Castle-mentality system. The problem with it is that it requires “trust” where trust is not possible.

However in a system with multiple parts that you watch over, you get the idea that each subsystem is effectively a prisoner in a cell.

From this new viewpoint you actually realise you don’t really care whether you trust the general purpose computers (GPCs) or not. You just have to worry about what goes in and what comes out.

Thus you can have I/O signatures for all of the programs etc.

You can follow this logic all the way down and partition programs to run on different GPCs.

This gets you into an interesting state where programs become programlets and then just functions. You effectively script the runtime together with mainly trusted component parts and a few untrusted parts.

The untrusted parts run in a prison cell and have no knowledge of the rest of the system. A prison warder provides inputs and takes away outputs.

It is very similar to the idea of hypervisors and minimal kernels, just more so.

Effectively, splitting a program up this way allows each part to be “known”, thus decreasing the level of trust required for it.

The other thing is it allows a minimum of resources to be given to each part of the program. Malware that hides needs lots of resources; if it cannot get them then it cannot hide…

Nick P February 23, 2010 4:31 PM

@ Clive

Well, your coprocessor idea is interesting. Indeed, something similar was implemented for running integrity checks on main memory to identify rootkits. The problem that I see is that, if you don’t trust the TCB, then you now have two instances of the same problem. Additionally, relying on IO signatures has the same pitfall the antivirus and IDS industry currently has: the IO signatures of malware can be made indistinguishable from innocent traffic. If the communication protocol and grammar were designed to facilitate signature-based detection, then I could see it working. But they aren’t, and you still have attacks on the coprocessor being feasible. I’ve yet to see software like this have no flaws. It’s a combination of hardware, firmware and/or software that does pattern-recognition and interacts in a distributed system. That’s a recipe for trouble.

The other idea you had, of partitioning the program into pieces, is basically MILS architecture. I agree with you on that one making it easier to prevent flaws: I’ve been promoting it for months. In fact, MILS architecture is the main source of my one-processor Castle mentality. With verified trusted components, mainly processor, firmware, drivers and MILS kernel, one can prevent all kernel-mode exploits. From that point, it’s on to defending userland. One can use a combination of MILS, verified components, and a careful security policy to make for a high assurance system. Anyone who doubts that the castle mentality can work should look to XTS-400 and GEMSOS-based Blacker platforms. They are very layered, but follow the centralized castle approach for client-side security. The distributed systems that are built on them assume the clients function correctly & are secure. Point being, even if we want to use your coprocessor, we still have to build it like a fortress using COTS hardware components. It might be easier than a GPU, but it has the same issues.

Clive Robinson February 24, 2010 6:40 AM

@ Nick P,

“Well, your coprocessor idea is interesting. Indeed, something similar was implemented for running integrity checks on main memory to identify rootkits.”

That’s a small part of what the idea can do. That is, once you think “I’ve got to do this, so what other benefits do I get?” you suddenly see a whole load of rather interesting ideas spring up.

“The problem that I see is that, if you don’t trust the TCB, then you now have two instances of the same problem.”

Ahh, my lack of explaining. The idea is you have a TCB that is a hardware hypervisor etc. Or, more traditionally, it runs the scheduler and security aspects of a more traditional kernel. Thus the Prison Gov trusts (to a certain extent) the warders but not the prisoners in their jail cells.

“Additionally, relying on IO signatures has the same pitfall the antivirus and IDS industry currently has: the IO signatures of malware can be made indistinguishable from innocent traffic.”

Yes and no. The problem with AV is it is looking only at inbound signatures, not signatures within the system.

That is, if you have a function that loads a file of a given size from a given location, you know fairly well what you expect to see on the IO bus in terms of sectors accessed, at what times, and in which order. Anything outside of that signature should be flagged up as suspicious.

Likewise you usually know how long a program takes to load and in what order various parts of it are called. Anything outside of that signature should be flagged.

The problem with most programs is they are complex and have complex signatures.

But… those complexities are built from simple signatures. If you split the program right down into its constituent functional and IO parts etc. you can have a more generic signature for each functional part.

Thus you are not looking at a single complex signature that malware can easily hide in; you are looking at relatively simple expected signatures that any malware would find difficult to hide behind.
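
A sketch of what one such per-function signature check could look like; the expected event sequence and timing bounds are made-up values that a build step would have to supply:

```python
# Expected I/O signature for one small functional unit, produced at build time
# (illustrative numbers): ordered events plus loose timing bounds in ms.
EXPECTED = [
    ("open",  "config.dat", 0.0, 5.0),
    ("read",  "config.dat", 0.0, 20.0),
    ("close", "config.dat", 0.0, 5.0),
]

def check_signature(observed):
    """observed: list of (op, target, elapsed_ms) captured by the 'warder'."""
    if len(observed) != len(EXPECTED):
        return False, "unexpected number of I/O events"
    for (op, target, t), (e_op, e_target, lo, hi) in zip(observed, EXPECTED):
        if (op, target) != (e_op, e_target):
            return False, "unexpected event %s %s" % (op, target)
        if not (lo <= t <= hi):
            return False, "event %s %s outside timing bounds" % (op, target)
    return True, "signature matches"

# A run that sneaks in an extra write gets flagged immediately.
ok, why = check_signature([
    ("open", "config.dat", 1.2),
    ("read", "config.dat", 8.0),
    ("write", "/tmp/.hidden", 0.4),
    ("close", "config.dat", 0.9),
])
print(ok, "-", why)
```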

“If the communication protocol and grammar were designed to facilitate signature-based detection, then I could see it working.”

That’s the whole point: defeat a single, almost impossible to deal with, complexity by breaking it down into many well understood and well characterised simplicities instead.

“But they aren’t”

Currently that’s the point of doing the design to see if the extra work required to implement them is worth the security gain.

That is, make a trade: a small reduction in efficiency to gain a lot of security 😉

“and you still have attacks on the coprocessor being feasible.”

Yes and no. It depends on what the co-processor does and how.

If it is effectively immutable and looks only to see if a signature is within bounds, and if not fails safe, then it is effectively a very simple state machine that can be fully modelled. Thus all the malware can do is trigger it and cause a DoS.

Likewise, if you are running systems in parallel to look for diffs, it is effectively a simple vote system that goes to fail on significant differences (the NASA ok-to-go type system).

The point is to make the warder as simple as possible and go for “lock down” at the sign of any odd behaviour from code in the jail cell.

“I’ve yet to see software like this have no flaws.”

Which is why you make it as simple as possible to do the job (complexity is rarely security friendly).

“It’s a combination of hardware, firmware and/or software that does pattern-recognition and interacts in a distributed system.”

Yes it is; the difference being that unlike the normal systems that do this, you manage the complexity through fundamental design criteria.

Not as a bolt-on afterthought on systems that are not designed to support it, which is currently the norm. And as you say,

“That’s a recipe for trouble.”

“The other idea you had, of partitioning the program into pieces, is basically MILS architecture.”

Only more so. If you use the ideas of parallel computing to break your program down into very small functional units, you can more easily quantify what is and is not acceptable functioning.

“I agree with you on that one making it easier to prevent flaws: I’ve been promoting it for months. In fact, MILS architecture is the main source of my one-processor Castle mentality.”

The trick I’m promoting is to still have your castle to the outside world, but not make the mistake of trusting the slaves that do the work, just the warders that watch over them. Specifically, you design the system like a prison, not a castle.

“With verified trusted components, mainly processor, firmware, drivers and MILS kernel, one can prevent all kernel-mode exploits.”

I’m not sure you can. 99.99%, yes, but at some point you have to say “I need real segregation”, because as you noted some hardware (DMA & VM controllers in particular) can do an “end run” attack in a single processor space. It’s an efficiency/security trade-off that is not worth paying the price for.

“From that point, it’s on to defending userland.”

The idea of the system is that it is not just kernel and userland; it’s much more structured than that.

Look on it as,

1, Prison Gov
2, Gov’s deputies
3, Block warder
4, Floor warder
5, Trusties
6, Prisoners

A traditional system would only have two levels, kernel and user, with a hard trust transition, which is very brittle.

The idea is to have completely untrusted but heavily controlled code at the prisoner/trusty levels, rising through the levels to full trust and little control at the Gov level.

“One can use a combination of MILS, verified components, and a careful security policy to make for a high assurance system.”

Yes, this is the same but more so; the idea is to also be able to use untrusted and unverified components in a controlled and monitored way.

“Anyone who doubts that the castle mentality can work should look to XTS-400 and GEMSOS-based Blacker platforms. They are very layered, but follow the centralized castle approach for client-side security.”

Yes, they do work, but generally not down to the point of being able to run untrusted/unverified code.

The big bottleneck on systems development is programmer time.

It has been obvious for many years that the “Unix Method” of components and scripts beats the “All in one Method” of compiled code in all but two areas,

1, Efficiency
2, Performance

But in a modern environment neither of these are actually that important and are becoming less so with time (think Cloud computing for instance).

The real problem with the Unix Method of “components and scripts” in a security environment is trust and verification.

On a Unix system the command line tools that can be scripted are not what they once were. They have become overly complex and thus are difficult to verify and turn into trusted components.

The simple idea I had was to minimise them and put them in a framework whereby they could be more effectively controlled.

That is, the framework they run in provides the “instrumentation” by which they can be monitored effectively. The build process actually produces two pieces of code: the part that does the work, and the part that builds the signature to pass on to the control side. The work part is “general purpose” and therefore insecure by definition; the signature part is not “general purpose” and can be secure by definition.

Likewise this duality goes on up through the script building as well.

Thus the “verification” process is built into the framework and the components don’t need to be.

This should reduce the verification bottleneck considerably, as it removes the current formal verification issues.

“[The] Point being, even if we want to use your coprocessor, we still have to build it like a fortress using COTS hardware components.”

That’s the point: you don’t have to have the jails built like fortresses, just solid walls and bars on the door and windows.

If you use, say, PC104 cards as the jails, you can connect them across a PCI or SCSI comms bus in a way that can be effectively controlled by another PC104 card acting as the hardware supervisor.

Thus your jails have a CPU, memory and a comms bus, nothing else. All the code communicates with an external comms stack (think Unix Streams as a model).

If you have to use DMA it’s under the control of the control framework not the jail CPU.

Thus you starve the jail of the resources to hide behind. Things like library code and executable code are made read-only to the jail CPU and thus immutable to it; call stacks etc. are protected; passed values are not passed in CPU registers, only on the stack. And the CPU can be halted and release its bus so that the memory can be checked (via DMA) etc.

The idea is that if you accept that no “general purpose” code can be secure, you find ways to make secure parts (state machines) that supervise it.

At the end of the day, other than resource usage, it does not matter if malware runs on your system, IFF there is nothing it can do to communicate data out of the jail. Likewise, malware will get removed each time the jail is “reloaded” with a new function, and that can be done quickly and easily by DMA.

It is not actually that much less efficient than other semi-secure systems, but it is one heck of a sight more secure.

It also scores highly on availability as well as on reliability (as you would expect).

However, it does not work with standard off-the-shelf code. It is much like any other parallel processing cluster, just with a security framework.

So from that point of view it is not currently that usable for end users, nor for those that want to use off-the-shelf closed-source code.

But as a test bed for the way to go it is most definitely interesting, especially as cloud computing, to be effective, is going to have to be done in a massively parallel form…
