Underhanded Implementation of RC4

A runner-up in last year’s Underhanded C Contest was a flawed implementation of RC4 that eventually just passed plaintext through unencrypted. Plausibly deniable, and very clever.

The other winners are also clever.

Tags: cryptography, RC4

Posted on June 20, 2008 at 6:56 AM • 29 Comments

Comments

Clive Robinson • June 20, 2008 8:38 AM

Not the only way to be underhanded with RC4 or any other stream cipher and Coding review bosses that do not know much about being sneeky 😉

I once wrote a stream cipher system that leaked the key in the parity bit of an ASCII Octet.

The main trick involved using a minor problem with most C compilers libs in that when you get memory for a buffer it almost never gets cleared befor or after use.

So if you have a sub in which you create a buffer to get/build the “key” in then before returning you free the buffer all looks well to most code reviewers…

However in most C compilers the lib implementation for memory heap handaling does not effect the memory contents (efficiency strikes at security again 😉

Therefore the original buffer contents stay in memory (heap) untill the programer either makes a mistake (freed memory reuse) or requests another piece of memory that is the same size or less than the original buffer size.

Therfore (easiest way) if you then ask for exactly the same amount of memory you get handed the same block of memory back with it’s contents unmodified.

Which means that as the original buffer held the “key” you have just passed it off by “sleight of hand” into another unrelated buffer (Opps 8)

The hard part was coming up with a convincing argument to modify the parity bit “randomly” that the code boss would swallow without question (hint show them lots of stuff on breaking LRNGs by examining the output and pointing out that the parity bit effectivly is known output from the stream generator)

Oh the reason for using the parity bit, most coders never check it, they just mask it off, therfore even when leaking the key it remained compatable with an older existing implementation (regresion testing can be a malware coders friend)…

The whole point of the excercise was to show to another party that unlike the coding bosses assertion proper “security reviews” where not the same as “code reviews” and one heck of a sight more important.

Clive Robinson • June 20, 2008 9:14 AM

@ Bruce,

I had thet deja feeling when typing the above and sure enough you posted about the “Underhand” contest back on 22 June 2005 where I made a similar comment to my above…

I wonder if the contest organisers read your blog 8)

P.S. Have a look at http://www.tinyurl.com/6498tk definatly one for the (snake) oil investors 8)

SteveJ • June 20, 2008 9:18 AM

@Clive:

And even if you memset your block to zero before freeing, what proportion of code reviewers actually notice if the “value” and “num” parameters are the wrong way around?

Davi Ottenheimer • June 20, 2008 9:22 AM

There’s something odd about the tone of a site that suggests people need to compete in order to make bad code look innocent.

“In general, the fewer hiding places, the more impressed we will be if you can conceal malicious behavior.”

As if there were some kind of shortage of bad code that conceals intent? I thought that was the majority of bugs in production…

Dimitris Andrakakis • June 20, 2008 9:27 AM

@Clive:

Wow 🙂

“Solution to a Worldwide Digital Epidemic:” (!!!!!)

“[…] iGarde Disc is digital security that far surpasses any protection on the planet today and is the holy grail to the multi-billion dollar problem of piracy.”

Rich Wilson • June 20, 2008 9:38 AM

“iGarde Disc utilizes a multi-randomized, polymorphic quantum physics encoding ”

Thanks Clive- I just got cereal on my monitor.

1915bond • June 20, 2008 11:43 AM

“The Competition: Although the computer, video game, film, music, and software industries are highly competitive, 2GeeksinaLab, Inc. has engineered technologies that are years ahead of the competition. According to recent sales calls that demonstrated our key technology – 2GeeksinaLab not only has no real competition, but we have managed to produce technology that is years ahead of current offerings.”

http://www.2geeksinalab.com/Investors.htm

LOL

Rich Wilson • June 20, 2008 11:55 AM

I think Bruce should extend a special invitation to 2geeksinalab to enter next year’s movie plot contest.

Anonymous • June 20, 2008 12:20 PM

http://www.2geeksinalab.com/customer.htm

Charles Decker • June 20, 2008 5:33 PM

“Completely hardware & Operating System independent, this version protects your content without the device, BIOS, or operating system even aware that GLOBAL GARDE™ is/has run. ”

I’d like to see that happen. Then I’d like to reverse engineer it. Then I’d repurpose the code, and become a billionaire after I let it loose in the wild.

/sarcasm

Andrew Garland • June 21, 2008 3:38 PM

The explanation given at The Underhanded C Contest for what is wrong is:

“The XOR-swap trick is an example of coders being too clever for their own good. Here, it gradually poisons the RC4 permutation with zeroes, until eventually the plaintext is just passed along in the clear. ”

I confess this puzzled me, because I know C and couldn’t see the problem. After a while, I understood. So, I’m providing my explanation, with the zeal of the newly enlightened. Have my apologies if I am just dense.

If A and B are different memory locations, then
A = A XOR B; B = A XOR B; A = A XOR B
works to interchange the values of A and B without using any extra memory.
Begin . . . . . -> A = 1010 B = 0011 A = A XOR B -> A = 1001 B = 0011 B = A XOR B -> A = 1001 B = 1010 A = A XOR B -> A = 0011 B = 1010 Great

But, in the algorithm, A and B are the same location 1/256th of the time Begin . . . . . -> A = 1010 B = 1010 A and B are the same location A = A XOR B -> A = 0000 B = 0000 B = A XOR B -> A = 0000 B = 0000 A = A XOR B -> A = 0000 B = 0000 Zero is injected

The array of encoding values is slowly filled with zeroes, and this happens in exactly the same sequence during encoding and decoding, so the encoding slowly goes to plaintext, and continues to code and decode “correctly”.

B-Con • June 21, 2008 6:10 PM

Thanks for the explanation, Andrew. I was looking for the problem in the wrong place…

Clive Robinson • June 21, 2008 9:23 PM

@ Andrew Garland, B-Con,

Your initial lack of understanding indicates that perhaps you are part of a younger generation of programers that have been fed “high level” programing…

If you have read some of my posts to this blog in the past you will know that I am of the opinion that “software engineer” is a false statment and that “software artisan” or “code cutter” is more appropriate for modern Comp Sci grads.

The reason for this is that it used to be the case that engineering subjects where taught from the base fundamentals upwards. And that programing was (pre 90s) likewise taught from the bottom up.

That is knowledge of basic logic, direct and indirect pointers, structures, single and doubly link lists, circular buffers, fence post problems and passing by ref / val and their merits that once where the staple of Prog101.

However the need to “bring on” programers now means that Prog101 is at best about partial knowledge of abstract ideas such as objects, inheritance, overloading and code reuse but primarily about how to drive the latest IDE.

The result is that the knowledge of your average graduate Comp Sci programmer is extreamly brittle and usually non portable from IDE to IDE let alone from programing language to language.

The reason is that your “code cutter” graduate has been trained in the latest CV “must have” language and it’s marketable features of the IDE and other support tools,not the fundementals. Presumably under preasure from industry or politicos (both true in the UK).

Thus the journyman “code cutter” lacks the ability to “stand back” from something that is new and radicaly different and relate it to what they know simply because they cannot take what they know back to the fundamentals and build it back up again in the new “methodology”.

A further and realy significant issue is that they are all but usless at debuging below the “compiler warning” level.

I have stated that I would rather take on a physics maths or electronics graduate that has done a couple of years “real time” assembler than a comp science graduate with 5+ years in a high level language, and not even interview them if they only had “object” language skills on their CV.

Unfortunatly during the 90s the rot seemed to set in to other engineering subjects such as electronics marine and aviation engineering and laterly civil and process engineering subjects where familiarity of the latest CAD/CAM tools not engineering fundimentals was the educational priority.

Subsiquently the rot has also spread downwards to things like primary school mathmatics where it is more important to show the correct use of the calculator or computer than basic addition or subtraction, and a child will be lucky to be shown how to do basic longhand multiplication let alone division with fractional numbers and differing signs.

Part of this is that teachers do not actually have knowledge of the subject over and above the books they teach from (I recently found out that the maths teacher at my sons school did not know about the rule of nines or other mental math short cuts and had never heard of Napiers bones or a slide rule).

Worse I know of quite a few Comp Sci graduates who cannot show why the floating point multiplier in a general purpose processor and a digital signal processor work in exactly the same way (ie integer multiplier and it is just the way you select which output bits to use) just because they have never been shown how to multiply numbers long hand. And worse have no knowledge of the difference between logical and arithmetic shifts and why you would use them.

And appalingly when asked to do simple mental math some engineering graduates do not apear to understand how to deal with numbers with different signs.

Oh and the most ignoble claim to fame that most modern Comp Sci graduates appear to suffer from is explaining how you do subtraction by addition only (look up ones / twos complement). And why the side effects are so important to the likes of computer security.

I guess that is why the more fun programing jobs go to graduates from other subjects such as physics maths and to a lesser extent electronics.

Any way my appologies for being so long winded on this but the problem realy gets under my skin.

Andrew Garland • June 22, 2008 9:49 AM

@ Clive Robinson

Hiring by the Even Numbers

When I was hiring for system development programming jobs, I asked 10 questions about programming subjects, like how does the debugger work, and what is the language description (BNF) for arithmetic expressions limited to addition and multiplication. Any CompSci grad should know these. Most didn’t.

I wanted the interviews to end on a good note, so I made the last question one that I thought everyone could answer: How would you determine if an integer is even? If the person gave one answer, I asked for other methods also. This can be an interesting subject, getting lightly into binary representations, operator side effects, and modulus arithmetic.

To my amazement, this question turned out to be an “indicator”. People who knew everything else could also discuss evenness. People who could not discuss evenness didn’t know anything else either! I was tempted to ask just the evenness question alone, as the whole test.

jiistme • June 22, 2008 4:56 PM

@Clive & Andrew

Man…I suck! Many years ago, I could’ve probably answered Andrew’s questions. (My first programming teacher in college was very good.) But, as time has gone by and the only skill-set I have needed is the type that Clive mentioned as being taught today, apparently I have forgotten all of the truly cool and exciting parts of computer programming. Heck, I couldn’t even remember ones and twos complement when Clive mentioned it. I remember the terminology, but very little about why it is important.

It is an odd thing too, as much as I would love to get into a job where I was forced to keep up my knowledge on the complexities of computer programming, I love the people I work with and so it is not worth going to a new job. That being said, it seems that it may be a good idea to actively put some time in at home to re-learning the basics of computers since it seems that I have forgotten alot of it.

David Robarts • June 22, 2008 8:37 PM

Maybe I’ve just had to much real math education and I’m just too interested in computers, but the evenness question seems dead simple to me. If you look at the ones bit for the integer you should be able to determine that the number is even (ones bit set to zero) or odd (ones bit set to one).

Andrew Garland • June 22, 2008 9:52 PM

@ David Robarts

Yes, looking at the low order bit can tell you if the number is even.
() Write the C statement that does this.
() Does it matter if the low order bit is on the left or right on your machine?
() What other ways can you think of for doing this?
() How would you use modulus arithmetic to do this?
() How would you do this without bit matching or mod arithmetic, using other features of C?

I didn’t ask these questions first, and I didn’t ask all of them if the candidate was having trouble. I expanded this inquiry into evenness as an experiment, to verify that good candidates could answer all of them. Weak candidates might answer 0 or 1 of them.

I was surprised that most people who had trouble in the first part of the interview didn’t know how to determine evenness, in any language specific way.

They usually did not know about checking the 1’s bit, and didn’t know how to do this in the C language.

It seems dead simple to me too. That is why I picked it as an exit question. But, it wasn’t simple enough.

The programming that we were doing required a feeling for math and algorithms.

GordonS • June 23, 2008 6:26 AM

@David Robarts

You’re complicating things 🙂

x % 2 == 0

GordonS • June 23, 2008 6:51 AM

@Clive Robinson

I’m afraid that I have to disagree with your stance on modern computing degrees (at least here in the UK).

I’m studying a degree in computing (at 27 years old, a little later than most; I’ve been working as a programmer for years though). I started programming on a Commodore 64 at the age of 8 or so, before moving on to object Pascal, then C, C++, PHP and finally C#. And I make no apologies for working with a high-level language like C#!

Anyway, my course started off on the hardware side of things, then went onto logic. We’ve learned about all the things you say we haven’t; logic, pointers, structures, link lists, circular buffers, passing by reference / value etc… And we’re learning in an abstract fashion, learning about concepts that are portable between platforms and languages. There has been a large focus on the maths behind computing too.

You also seem to think there is a focus on the IDE more than anything else – again, that has not been the case in my experience. Indeed, it’s quite the opposite. We use a few different IDEs, and are not allowed to use form design tools – essentially the IDE is just useful bcause you get syntax highlighting, built-in debugging etc.

I’m doing the degree through the Open University, but I did start another degree with the RGU in Aberdeen several years ago that was teaching much the same stuff, in much the same manner.

I think you’re taking a bit of a dinosaur’s look at things 🙂 I certainly think complaining that a primary school teacher doesn’t know what a slide rule is is a bit silly (although I do take your point).

However, after arguing against you points I will concede that it is to find decent high-level language programmers. In my experience, most don’t know there arse from their elbow. I liked Andrew Garland’s ‘Hiring by Even Numbers’ approach – such simple problems are useful for quickly weeding out the time wasters 🙂

Roger • June 23, 2008 7:09 AM

@Davi Ottenheimer:
“There’s something odd about the tone of a site that suggests people need to compete in order to make bad code look innocent.”

At present, people are certainly out there creating this sort of stuff maliciously, in secret, and we do not have systematic methods of detecting it (e.g. the 2003 Linux 2.6 kernel attack was detected very quickly because the attackers only modified the CVS tree, which therefore differed from the authoritative BitKeeper version. However the Debian OpenSSL bug persisted for TWO YEARS before it was detected; I don’t claim that Kurt Roeckx acted maliciously with this “bug fix”, but that’s just what a covert backdoor in open source code would look like, and we had a lot of trouble finding it.)

It seems to me that by holding an open competition, we (that is, the information security community) will learn more about the sort of techniques that can be used to hide malicious code in open source, and thereby become better at detecting it.

Clive Robinson • June 23, 2008 7:56 AM

@ GordonS,

I’m well aware of just how different the Open University is as a pioneer of distance learning it is substantialy different from attending Universities.

Of attending universities if you go to Camb/Oxford and quite a few of the “original unis” then they derive a substantial part of their income from research grants and their teaching is aimed at those who are going on to do research. They would see “dumbing down” courses as a disaster and they see it as an absolute must to maintain their reputation as “research” institutions.

The newer Unis (Old Polys etc) have a very different financial model which is “bums on seats” and vanila teaching, and they derive very little of their income from research funding. They need a high turnover of students or other non Government derived income to survive.

As I noted the Open University is in an almost uniques position, it trys to maintain a very high accademic teaching standard similar to that of the Old Unis for all students as well as doing a disproportianatly large amount of research for a University that is bassed substantialy around distance learning.

However at the end of the day the majority of places on CompSci degrees in the U.K. are in the “new Unis”. That as I said above derive little or none of their income from research grants and do not aim to attract the cream of students and only realy offer a qualification. Often their finances are (of necesity) topped up by either forign students or courses sponsored by major employers who obviously have a say on course content. As part of the process they need to “Market” to attract in students and often the only difference they offer is what “buzz word” teaching they do.

I fully expect this situation to get worse as Tony Blairs dream of getting 50% of people degrees continues. Basically all but a few Unis will need to pander to the whims of those providing them with their income…

JimFive • June 23, 2008 3:40 PM

@Clive
I can only think of 3 ways to tell evenness…

x & 1 = 1
x % 2 = 0
and
x << sizeof(x)*8 = 0

I’m guessing this last one is what you were looking for with your last question.

JimFive

Clive Robinson • June 24, 2008 8:05 AM

@ JimFive

It was not actualy me that posed the evenness question but Andrew Garland,

And Jim if you gave me your first answer of,

x & 1

I would ask you,

What happens if the number to be tested is negative?

Judging by the blank / puzzled / scared looks I have seen 80-90% of candidates would fail without a further prompt of,

If the CPU is a 1’s complement machine?

If you answer correctly without the prompt it tells me you have actually done some low level programing, with the prompt it tells me you are aware of one of the limitations of C and if not (back to your mainstream MS code-cutter shop 😉

When you get close to the metal C is not architecture nutral and the assumption of behaviour only holds true for unsigned integers. Having worked on PDPs and some other odd architectures K&R where well aware of this which is why they left it open.

You are now thinking something like,

“In reality what are your chances of meeting a one’s complement machine these days it must be close to zero with general purpose CPUs.”

However when you get to custom CPUs made from FPGA’s etc to do specific transforms efficiently (in say signal processing or graphics) you do stand quite a chance of bumping into non 2’s complement behaviour.

The reason being that most A-Ds output for zero volts is actualy 0x10… or 0x7F…, 0xF… is the maximum positive value and 0x0… is the maximum negitive value, and D-As output voltage in the same way.

For speed and efficiency the extra operation to normalise to 2’s complement is just not justified, as the overhead becomes excesive…

Likewise the “grey code twiddle” needed for some DSP algs etc might not work with the negative number representation used on your CPU.

“Real Time Hardware” and “Device Driver” programers realy are a breed apart from the norm, and often do a big chunk of their work in assembler to get the I/O or Interupt response times to acceptable levels.

Final question hold your hands up the distance light travels in 1/1GHz (app 30cms or a foot is close enough 8) which is one of the real limits on how you can design electronics and therefore computers (Moore’s Law is just an assumption 8)

Andrew Garland • June 24, 2008 6:31 PM

You see what happens when trying to ask a simple question. It may even be harder to answer a simple question, because the answer(s) are easy to understand, than a complex question which gives complexity to hide in.

@ Clive Robinson
I didn’t consider 1’s complement machines, but it complicates the problem nicely. It eliminates using bit tests because they are not transportable.

@ JimFive
x & 1 = 1
(x & 1) == 1 or (x & 1) as a truth value works, for both positive and negative numbers. The parens are needed because == has higher precedence than &.
This works on 2’s complement machines only, as pointed out by Clive Robinson.

x % 2 = 0 [you meant x % 2 == 0] works transportably

x << sizeof(x)8 = 0 [you meant x << sizeof(x)8 == 0]
I think this always gives zero, because it shifts out everything.

What I meant by “using other features of C?” was something like
2*(j/2) == j where j is an integer. The idea was to see if the candidate could construct the mod function if he didn’t have it available. But, I think I put it badly here.

This discussion shows how complicated software becomes if one chooses the wrong basic function to use. In this case, modulus arithmetic “%” is the savior.

Clive Robinson • June 25, 2008 1:42 AM

@ Andrew Garland,

“It may even be harder to answer a simple question”

Definatly, not just for the reason you stated, think in degrees of freedom…

A difficult question is either drilling down for a fairly specific answer or the person asking the question may not know enough to answer the question simply (a failing I have seen at many interviews which an applicant might well take advantage of to spin you a line).

A deliberatly simple question however “is a licence to roam” in that it should test the bredth of knowledge and more importantly give an insite into the applicants reasoning process. Hints are helpfull to both the applicant and you especialy as it enables you to shepard the discussion in the direction you want it to go.

To use an analagy, A pencil is a simple tool, a CAD machine is not. The pencil can be used to draw in many ways you might not have thought of (ie shading by cross hatch / dots / dashes / smudging). The CAD machine only gives a limited set of options that the designer included and only a very very limited way to access any particular feature.

When interviewing people I have a very simple process. I ask an individual applicant about what is on their CV that I have knowledge about, and two simple questions that I ask all the applicants. The CV questions are realy only to alow them to relax and talk about what they have done and give me an idea of their style of answering and warm them up for the two questions .

The first question will be speciffic to the domain, such as the Evenness question above which gives them an oportunity to show their stuff and their reasoning ability within the domain (drill down question). It will have one or two simple answers you would expect all applicants to know and many other ways of achieving the same objective (but you the questioner realy must be very very conversant with the subject).

The second question is to test their reasoning ability and it has little or nothing to do with the domain it intentionaly has no right answer. It needs to be very general and not have limiting social overtones (politics, faith or current events etc) and might be as follows (but please make your own otherwise it will not be as effective),

“You have an important meeting with a client in the morning. When you get in your car and turn the key the engine does not turn over. What do you do next?”

This is usually so open and unexpected that it should give quite an insite into their general way of thinking. Oh and if they answer immediatly then either they have heard the question before or they are a sales person 😉

Seriously expect a slightly puzzeled or initialy blank look or even a slightly unfocused start and ask them other gentle questions to get them into the swing of it. The results are usually worth more than the rest of the interview.

Finaly I always give the applicant feedback on how they did in general. It is probably the most difficult part of the interview process. I do it for a couple of reasons firstly because if I have had a negative feeling about them for some reason it’s an oportunity to sort it out, secondly I have sat there as an applicant myself and I have always appreciated the feedback good or bad on the few occasions it has been given.

The easiest way is to go back to their CV and very gently go through the good and bad bits with them again.

Chris K • July 17, 2008 6:35 AM

This isn’t new. “Hidden Keys to Software Break-ins and Unauthorised Entry” by Dmitry Sklyarov p93-96 describes exactly this mistake in an implementation of RC4 in ADA. He refers to the cryptogarphy-digest mail list for February 2001.

Nong • July 27, 2008 9:37 AM

Dear Professional,

I have data 001,010,011,101,110,111. How can I encode them into 3 – 6 bit and can decode back for these 6 values (001,010,011,101,110,111)?

Any one can help me?

zzzz • March 20, 2009 10:45 PM

There are other ways to isolate the lowest bit without using and.

For example… (x – ((x>>1)+(x>>1)) will return 0 for even numbers and 1 for odd. On 2’s complement machines it works for negative numbers too… on 1’s complement machines you might have to exchange the meanings when x is negative (but I’ve never seen such a machine… everything made in the last 30 years is 2’s complement).

ROM • April 21, 2009 3:56 AM

One of the more interesting discussions I have seen in weeks. 🙂