Microsoft Bans Memcopy()

This seems smart:

Microsoft plans to formally banish the popular programming function that’s been responsible for an untold number of security vulnerabilities over the years, not just in Windows but in countless other applications based on the C language. Effective later this year, Microsoft will add memcpy(), CopyMemory(), and RtlCopyMemory() to its list of function calls banned under its secure development lifecycle.

Here’s the list of banned function calls. This doesn’t help secure legacy code, of course, but you have to start somewhere.

Tags: Microsoft, operating systems, Windows

Posted on May 20, 2009 at 6:17 AM • 85 Comments

Comments

Clive Robinson • May 20, 2009 9:15 AM

As discussed a few days ago on this blog there are better ways to go these days than with C/C++.

Unfortunatly after something like 40years the assumptions about the resources are the other way around these days (ie Hardware=cheep Humans=expensive).

We need to think of a suitable high level language without the faults but still retaining the power and compactness.

Oh and please please thing up a better way to delcare variables (it’s a night mare hence so many pointers to void, and the risks ascociated).

I think both K&R have said they would change the precedence and ascociative behaviour in C…

Oh and perhaps we should stop thinking of “files” being the basic data/comms model I’m sure there are better ways to do it.

But hey Rome was not built in a day 8)

JRM • May 20, 2009 10:15 AM

It’s not all that smart. The “safe” version just includes an extra length parameter (of the target buffer). If the programmer knows both sizes, a good program will already check, and a bad program will just pass the same size as both lengths, or define a lazy workaround:

#define memcpy(dst, src, len) memcpy_s(dst, len, src, len)

Since the programmer now has to check the return value of memcpy_s() (instead of the sizes of the buffer and copy), there’s no saving in error-handling code.

David • May 20, 2009 10:17 AM

If there are significant numbers of pointers to void in your C++ code, either you’re doing low-level stuff (which cannot effectively be done using a “safe” language), or you’re doing it wrong. It’s perfectly possible to write good, secure C++ code (and very easy to write the other sort, true).

I wouldn’t want to write a user app in C, though. There’s too much that can go wrong, and there are languages that are more expressive and safer (C++, C#, Java, Lisp…).

That being said, I don’t see what’s so bad with memcpy(). The issue with sprintf(), sscanf(), gets(), strcpy(), and others like them, is that there is no particular reason why they couldn’t overwrite all sorts of memory if something unexpected happened. With memcpy(), you specify the number of bytes to move, much like snprintf(), fgets(), and strncpy(). There are a lot of C functions that should be banned from use, but memcpy() sure doesn’t look like one of them to me.

Roboticus • May 20, 2009 10:29 AM

In Writing Secure Code For Windows Vista, Michael Howard and David LeBlanc (Microsoft security dudes) say “if we see the function call used in three or more security bugs, we ban that function for new code.” If it took this long for memcpy to be used in 3 bugs in Microsoft code that is actually amazing.

Jack • May 20, 2009 10:30 AM

I’m guessing this is not a philosophical list, but a “That is the fifteenth thousandth security exploit involving memcopy, it’s going on the list” list.

Todd Knarr • May 20, 2009 10:42 AM

The big problem with the prohibition is that isn’t not memcpy() that’s the problem. The problem is programmers who don’t think about buffer lengths while writing code, who simply don’t bother doing the work to check sizes correctly. For instance, when passing a destination buffer into a function, rather than going “Oh, I don’t know it’s size, I need to change the API to pass that in so my function can check whether the destination’s big enough.”, they go “I don’t know the size, so it’s the caller’s responsibility to pass in a big enough buffer. Gee, that made my job easier…”. Banning memcpy() won’t help, because programmers like that will continue to find ways to not think about that no matter what you do. You’ll either do nothing to help the problem, or you’ll make the code with the problem in it more obscure and harder to spot.

Trying to idiot-proof a system’s a losing proposition, because the universe keeps coming up with better idiots.

Marcos • May 20, 2009 10:44 AM

I also fail to see the problem of memcpy, or, more specificaly, I see how it can be missused (using strlen as the length argument), but the person that does such a thing will create vunerabilities with any API on any language you care to name.

Also, why is sprintf on that list? Should people only use asprintf?

Marc B. • May 20, 2009 10:49 AM

Stupid, stupid, stupid.

memcpy_s throws exceptions right and left, if the parameters are not checked thoroughly. So you have the same issue as with memcpy – you are responsible to sanitize all the input into your routines.

It is completely irrelevant which function you use.

Adam • May 20, 2009 10:51 AM

Marcos,

Sprintf is on the list because of format string issues.

Anyone looking for more information on where to start should take a look at our Security Development Lifecycle pages (microsoft.com/sdl) and the optimization model, which is all about how to get started.

Because Bruce is right–you have to start somewhere, and this is a step along the way, and part of how our SDL is evolving. It’s not one thing to do in isolation.

Ben Combee • May 20, 2009 10:58 AM

sprintf is best replaced by snprintf which takes a buffer length so you can’t write past the end of the destination buffer. asprintf is also useful if you don’t know what your max length will be and you’re OK with allocating memory.

Tim • May 20, 2009 11:05 AM

“If there are significant numbers of pointers to void in your C++ code, either you’re doing low-level stuff (which cannot effectively be done using a “safe” language), or you’re doing it wrong.”

Exactly. C++ is a great language. D looks good too. Would be nice if there was a byte code version of C++ though to avoid the whole recompilation issue. Java is just evil (seriously who designed StringBuilder?). Haven’t tried C# though. Is it any less evil than Java?

Rev Matt • May 20, 2009 11:06 AM

Programmers are writing bad code, rather than provide adequate training or fix the peer review process let’s ban the use of this function. So the developers don’t learn how to write secure code, but instead write insecure code using something other than memcopy. Brilliant nonsolution.

kangaroo • May 20, 2009 11:14 AM

I’ve never seen you this wrong before, Bruce. The problem is programmers that are working above their pay-grade — no “process” is going to fix that, other than hiring better folks, paying them more, and working them less.

It’s particularly idiotic to go via a statistical analysis to decide what are “evil” functions. The misuse of functions don’t point to a problem in the function — they either point to a poor choice in methods (innappropriate programming language, review process, …) or poor choice in employees.

Papering it over with some “ban” is only making the problem worse — it’s “programming theater” that puts a problem under the carpet, rather than a solution in any way. “You have to start somewhere” is just a cry of despair — no, you have to start in the right place, starting “somewhere” is just marketing.

Nathan • May 20, 2009 11:14 AM

I’m with Todd. This is just syntactic sugar, plain and simple. This is a perfectly valid and reasonable implementation of memcpy_s(), right?

#define memcpy_s(a,b,c,d) memcpy(a,c,min(b,d))

This is not going to help the programmers who simply fail to think about what will happen when their code actually runs, in any way. The only thing this change has to do with security is the false sense of such.

David • May 20, 2009 11:17 AM

/Opens disk editor. “Why are my afp passwords appearing in unused regions of this word document? Didn’t I turn that in as homework, only to have it posted on the course website?”

David • May 20, 2009 11:20 AM

Isn’t this sort of a programming variation of an “Are you sure?” window popping up? The first time it pops up, perhaps the programmer will think about checking the sizes of the buffers. After the 4th time, they’ll be making the same mistakes, and after the 8th time, they won’t even be conscious that they’re typing memcpy_s(target,len,source,len) instead of memcpy(target,source,len).

Nathan • May 20, 2009 11:24 AM

I am rinsing my schadenfreude mug, preparing for the inevitable exploit caused by misuse of memcpy_s(). Boy, will that be delicious.

Curt • May 20, 2009 11:25 AM

I don’t understand people who believe that only incompetent programmers can make mistakes.

Nathan • May 20, 2009 11:27 AM

Curt, no one believes that only incompetent programmers make mistakes. It’s just that no one sees how use of memcpy_s() will make any difference in the likelihood of a mistake, regardless of the competence of a programmer.

FP • May 20, 2009 11:28 AM

What annoys me most is that Microsoft mandates the use of a new, proprietary “memcpy_s” function. If only they had introduced it into any standard (like ISO C) first.

It’s trivial to work around but still a burden for those who want to write portable code. Embrace and extend!

Nathan • May 20, 2009 11:30 AM

Allow me to clarify:

If a hypothetical programmer is so incompetent that he is simply unable to use memcpy() properly, then memcpy_s() will not help him.

If another one, highly talented and detail-oriented, nevertheless occasionally makes mistakes using memcpy(), then it seems he is just as likely to make the same mistake using memcpy_s().

Disagree?

Mark R • May 20, 2009 11:31 AM

I like the dangling participle in this sentence from the Reg article:

“He likened memcpy() to other risky functions such as strcpy() and strcat(), which Microsoft has already banned after exacting untold misery over the years.”

That’s either careless writing or a very cleverly veiled shot at Microsoft…

TD • May 20, 2009 11:40 AM

Let’s not confuse people by appending ‘_s’ to the names of ‘safe’ functions, a suffix commonly used in the naming of structs.

David • May 20, 2009 11:40 AM

@Jack: The problem is that, if the list is based on the number of times a function is listed in security reports, and not philosophy, it’s not very useful.

What we need to do, for security purposes, is identify functions that can be replaced by potentially safer equivalents. For example, you can’t control how much memory will be overwritten by gets(), but you can with fgets(). Nobody can write safe code that uses gets(), and it should never be used in anything remotely safety-critical.

However, there’s no practical difference between memcpy() and memcpy_s(). A good programmer will be precisely as safe with both. A bad programmer will be precisely as dangerous with both.

Banning memcpy() because it shows up in a predefined number of security reports is much like banning liquids from aircraft: it’s mindless, inconvenient (far more so for the airline passengers, of course), and useful only as security theater. Banning gets() is like hardening the cockpit: it is clear that it will stop a certain class of threat, and is useful as security.

Adrian • May 20, 2009 11:42 AM

Several comments have pointed out that banning dangerous functions isn’t a complete solution. True, but it is part of defense in depth.

Programmers should be better trained to understand what their code actually does and to recognize potential pitfalls.

Compilers and analysis tools should check for buffer overruns and other common errors.

Testing should be more comprehensive.

And dangerous functions should be eliminated or replaced, just in case the other three layers fail.

Just because I know how to use a table saw properly doesn’t mean I want one that’s lacking safety guards.

Curt • May 20, 2009 11:44 AM

A well-intententioned programmer that supplies appropriately obtained values to memcpy_s is less likely to make an error than one who supplies no-less-appropriate values to memcpy_s. The extra value is a check on the mental model of the programmer that also transmits additional information to code reviewers to help them reason about the code. Naturally, someone who passes the same value to both parameters will defeat this check — but even this can be easily found in a code review.

Curt • May 20, 2009 11:44 AM

Oops… the second “memcpy_s” in my first sentence should have been “memcpy”.

unsurprised • May 20, 2009 11:45 AM

I’m tempted to start a contest in the IOCCC vein where programmers are challenged to create the most insecure code with the (supposedly) most secure languages. For example, creating a buffer overflow exploit in Java not by writing past the end of an array, but by allowing part of the array to be used for things like security parameters, in which overwriting, say, the last part of the array overwrites the permissions bits. Appending zeros to your input gives you root access, all without a buffer overflow!

This is exactly the wrong approach: a programmer who creates a security hole by using memcpy will create security problems regardless of the language used. Banning the API only frustrates the programmer who has a legitimate use for it, e.g., copying a video buffer during a video decode.

M Welinder • May 20, 2009 11:51 AM

The list includes strlen.

Curt • May 20, 2009 11:51 AM

There’s some pretty compelling research which shows that the simple use of checklists in hospitals can have a dramatic effect on mortality. Think of this extra parameter as the equivalent of an item on a checklist.

Clive Robinson • May 20, 2009 11:55 AM

As discussed a few days ago on this blog there are better ways to go these days than with C/C++.

Unfortunatly after something like 40years the assumptions about the resources are the other way around these days (ie Hardware=cheep Humans=expensive).

We need to think of a suitable high level language without the faults but still retaining the power and compactness.

Oh and please please thing up a better way to delcare variables (it’s a night mare hence so many pointers to void, and the risks ascociated).

I think both K&R have said they would change the precedence and ascociative behaviour in C…

Oh and perhaps we should stop thinking of “files” being the basic data/comms model I’m sure there are better ways to do it.

But hey Rome was not built in a day 8)

Nathan • May 20, 2009 12:02 PM

Here’s what I think is a relatively common use of memcpy():

char *buf = malloc(src_size);
assert(buf);
memcpy(buf, src, src_size);

When arguing about the very theoretical benefit of replacing memcpy() with memcpy_s(), consider the wastefulness of comparing src_size to itself in this use case, which I suspect accounts for the vast majority of memcpy() uses.

David • May 20, 2009 12:06 PM

@Curt: While overcasual use of memcpy_s() is going to be obvious in a code review, so’s bad use of memcpy(). I also don’t think it looks much like a checklist. A pilot or nurse using a checklist is performing a major procedure, unlike a programmer using memcpy().

@M Welinder: strlen() won’t write over anything, but can try to access a vast amount of memory, and if this causes page faults it can take a significant amount of time. Call it a potential performance bug.

What struck me as odd on the list was strncpy() and strncat(), until I read the rationale. They aren’t just length-safe equivalents of strcpy() and strcat(), but functions with behavior that’s very like their non-n equivalents, but different enough to be tricky. I assume that strcpy_s() is pretty much a length-safe variant of strcpy().

So, the only ban I don’t understand is that of memcpy(). I would expect that, on the statistical basis, they’ll have to ban memcpy_s() down the road, and replace it with memcpy_s_s(), which will require the copy length to be entered three times, and so on ad infinitum.

Nathan • May 20, 2009 12:17 PM

David wins the thread.

John • May 20, 2009 12:24 PM

“Here’s what I think is a relatively common use of memcpy():

char *buf = malloc(src_size);
assert(buf);
memcpy(buf, src, src_size);
”

I most certaintly hope that isn’t a common use as written. Because it is begging for security issues. The problem is the assert(). Namely, once the code is compiler for production release, it goes away.

So you have a system that’s equivalent to wearing a safety vest while the ship is docked in port, but once it heads off into deep water, the vest is discarded.

Bryan Feir • May 20, 2009 12:29 PM

@M Welinder:

Actually, I have seen strlen() involved in a security bug. (And yes, it was my fault at the time: it was known that the buffer could be full, so I did strlen() then capped it at the buffer length. Wrong order of operations: I should have capped the length first and then called a limited search function such as memchr().)

Basically, the implementation of strlen(src) in the system library we were using for an embedded processor just did memchr(src, 0x00, 0xffff)-src. Not only was there a problem with a bus error if src was less than 64K from the end of memory and there was no NUL in that space, but if no NUL was found in 64K worth of search and memchr() returned NULL itself, strlen() could return some very strange values.

Gweihir • May 20, 2009 12:31 PM

Sorry, but this is not smart at all. The problem is not the function call, but the ones using it an unsafe manner. As long as there is direct memory access in C, this type of mistake can be made. However the direct memory access cannot be removed, C is still needed as assembler replacement. There are only two ways to fix the problem:

Have only people with understanding of software security design systemsn and write code.
Go to a language without direct memory access.

Option 2. has the drawback that this does only shift the issue to more complex vuknerabilities. ”Programmers” that do not understand security will screw it up, no matter the language. And there is a host of security problems that have nothing to do with direct memory access. I agree that the number of buffer overflows still found in software today is pathetic, but the issue really is the people writing this software. They are underqualified, undertalended, possibly underpaid, likely under too much time pressure and overestimate their own skills. Many of them will not even see the problem or feel clever that their code is so elegant, it does not need any additional checks…

M • May 20, 2009 12:32 PM

There is a simple solution to prevent programmers from doing memcpy_s(target,len,source,len).

A precompilation checker could verify that the second and fourth parameters are not the same. Isn’t this the type of work prefast is supposed to do? I wouldn’t be surprised to see some checks added to prefast to put some sanity checks around the use of memcpy_s.

The use of the more explicit function causes programmers to put at least a modicum of thought into buffer lengths. Adding static syntax checking on top of that can catch the most egregious abuses of this function. It won’t be too terribly difficult to make it harder to use the function wrong than to use it properly.

JRM • May 20, 2009 12:54 PM

@M:

You can’t pre-check, both because aliasing in C/C++ makes it impossible in general, but also because it makes the programmer have to try trick the checker in the perfectly normal case mentioned above where the destination buffer is explicitly sized for the source length:

void * dst = malloc(len);
if (dst) {
if (memcpy_s(dst, len, src, len) != 0) …

Should I make a new variable just so I can copy len into it so the checker won’t complain? And if that is such an easy way to defeat the checker, won’t people trying to cheat just do that in their memcpy_s()-avoidance macros?

You already have to be thinking about buffer sizes when using memcpy(); it’s part of the API. Adding a second length does not make thinking about it any more likely or not thinking about it any safer.

Gweihir • May 20, 2009 12:59 PM

@M: That will not help. The problem is thet some (many) people do not understand why memcpy() is dangerous. Those that do understand it will need no such check. Those that do not understand it will just work around the check. Example: Use a wrapper:

memcpy_s_works(a,b,c,d) { memcpy_s(a,b,c,d);}

Done. The pre-check gets impossible without dataflow analysis. Such “clever” tricks are bound to be discussed among the incompetent even now…

Garrett G • May 20, 2009 1:02 PM

In a world w/o security cure-alls, this is obviously a good move.

Gweihir • May 20, 2009 1:29 PM

@Garrett G: There is nothing “obvious” in secure software implementation….

Dave • May 20, 2009 1:49 PM

Strikes me more as extend and embrace. Code written under Linux is now harder to port to Windows. Code written under Windows is now harder to port to Macs.

In the end, it’s a lot of grief that doesn’t really solve anything. Exactly what I’ve come to expect from large corporations.

Bob • May 20, 2009 2:32 PM

Not surprising. Microsoft started this a few years ago. I had occasion to update some device driver code from Visual Studio 6 to Visual Studio .NET 2003, and lo and behold, things didn’t compile. They’d disabled a whole bunch of standard C library functions like this that they thought were unsafe. Thanks, Microsoft, I wasted quite a lot of time figuring out how to re-enable them so that perfectly safe code would compile.

Mind you, I’m all for making secure coding easier and insecure coding harder. And specifically, strncpy’s behavior is an abomination that frequently leads to unterminated strings and it should be deprecated from the language standard.

But my experience is the same as what a lot of other people have said. Security-minded programmers can use all those “unsafe” functions safely, and incompetent programmers won’t use anything safely. So I don’t find this as exciting as Bruce does, or expect it to immediately lead to better software.

However, I do things like this will make insecure programming difficult enough that it will encourage incompetent programmers to become more skilled at their craft.

Bob • May 20, 2009 2:35 PM

Oh… and I do wish safer calls were part of the language standard.

Dave is right on the money – Microsoft’s replacements for these unsafe calls are not portable.

I tend to use the safe replacements from GLib on Linux and Mac OS, myself. But I doubt Microsoft will get on board with that.

Patrick Stein • May 20, 2009 3:41 PM

I’ve got to say that I’m puzzled by this. Faced with this, any programmer would immediately write the aforementioned macros or write his/her own my_memcpy().

Copying things is essential to most any program (and definitely to any C/C++ program). You have to copy. Maybe one could just ban passing arguments into functions? or allocating arrays on the stack? or pointers?

People are going to copy. People are going to create bugs when copying. Even if they use memcpy_s() explicitly (as opposed to via macros), they will botch it.

A program that uses memcpy_s() poorly will be just about as bad as a program that uses memcpy() poorly.

joel8360 • May 20, 2009 3:49 PM

@Adrian:

If, in spite of all the safety measure, someone loses three fingers in three table saw accidents, will they be safer if you take away their table saw and give them a radial arm saw instead?

I’m all for defense in depth, but this particular measure sounds like trying secure a house by erecting a steel door in the middle of the sidewalk.

Nostromo • May 20, 2009 4:20 PM

This is Microsoft, so it has nothing to do with better programming and everything to do with persuading people to write non-portable, Microsoft-only code.

For example, they don’t advocate using strncpy, the “safe” version of strcpy, because – wait for it – it’s “so hard to call correctly”. Bollocks; if you call it with the integer argument first insted of last, a decent compiler will detect that and warn you.

This is yet another attempt to promote Microsoft-only lock-in, and should not have been given any publicity.

icf7 • May 20, 2009 4:25 PM

Also note that people might be tempted to

#define ypcmem(dst, len, src) {size_t i = len; while (i–) *dst++ = *src++;}

Bonus points for declaring i as int.

David • May 20, 2009 4:31 PM

@Nostromo: It isn’t that strncpy() is hard to write or call, but that it isn’t quite “the ‘safe’ version of strcpy”.

strcpy(), when called with something that isn’t a string, or too long, will overwrite memory, but if it doesn’t mess up too badly will leave a null-terminated string. strncpy(), under those circumstances, copies what it can, but doesn’t leave a null-terminated string. This will very likely cause some problems in later processing. It isn’t as error-prone as strcpy(), but it’s still easier to mess up with it than it should be.

Jeff Dege • May 20, 2009 5:11 PM

“This requirement was put in place to prevent use of certain older C runtime functions that lead to buffer overrun flaws and have been deprecated.”

By whom? I can see a good argument that these functions should be deprecated, but if that’s going to happen, it’s going to have to be the ANSI/ISO standards committees.

Microsoft doesn’t own the language standard, and has no authority to declare any function deprecated, no matter what they pretend.

Dolphin • May 20, 2009 5:28 PM

@Nostromo

The main problem with strncpy is that it will happily leave you with an unterminated string, leading to lots of:
strncpy(buffer, string, bufferLen);
buffer[bufferLen-1]=0;

Not to mention, it seems somewhat wasteful to always pad out the whole buffer with 0’s as strncpy does.

Gweihir • May 20, 2009 7:14 PM

Well, I have take to using strdup() or strndup(). The second also ensures that the target string is 0 terminated, and allocating a new string is often what you want anyways. If not, you should have a very clear idea about the target buffer size and enforce it.

Lawrence D'Oliveiro • May 20, 2009 7:53 PM

Interesting how so many of the functions have to do with null-terminated strings. Instead of picking off these functions one by one, why pussyfoot around the issue? Why not just come out and say “ban null-terminated strings”? That would be so much simpler.

Anonymous • May 20, 2009 9:00 PM

So if the programmer knows the destination buffer length, they’ll replace it with memcpy_s… but then they probably already know that the source will fit.

If they don’t know the destination buffer length, will they rewrite whatever portions of the code are necessary to expose it, or will they write
for(int i = 0; i < src_size; i++) ((char)dst)[i] = ((char)src)[i];
?

Not that deluded • May 21, 2009 12:44 AM

@Dave
“Code written under Linux is now harder to port to Windows. Code written under Windows is now harder to port to Macs.
In the end, it’s a lot of grief that doesn’t really solve anything. ”

As if the replacements are hard to write. If it takes one competent programmer a week of concerted effort, including tests and publishing it to sourceforge.com, I’d be shocked. No, it’s very little grief that doesn’t really solve anything.

Besides, if memcpy() is banned, determined programmers will just write their own function, only not call it memcpy(). It’ll be safe_copy(), or bpsc() for Bob’s Perfectly Safe Copy, or something equally misleading. If they don’t have detailed code audits, it’ll all pass through the bozo filter.

Nostromo • May 21, 2009 12:50 AM

@David, @Dolphin:

You’re nitpicking at a side-point. The main thing to understand about this whole discussion is that it is yet another attempt by Microsoft to promote non-portable, Microsoft-specific code. I’m appalled that Bruce helped it along by giving it publicity.

Somebody has a proposal to make C programming less error-prone? Fine, discuss it in an open forum, get a bunch of independent compiler producers on board, hammer out weaknesses with the whole C community, then maybe propose it as an ANSI standard. That’s if they REALLY want to make C programming less error-prone. If they just want to con programmers into writing stuff that will work only on their platform, then they’ll do what Microsoft has done.

Anonymous • May 21, 2009 3:54 AM

Don’t fall for M$’s “Embrace, extend and extinguish”.
Switch to Cyclone for “low” Level / Drivers:
http://cyclone.thelanguage.org
http://en.wikipedia.org/wiki/Cyclone_(programming_language)

And use Eiffel for all the High Level / Software Construction stuff.
http://en.wikipedia.org/wiki/Eiffel_(programming_language)

@Clive Robinson:
And Bjarne Stroustrup knows that too:

Stroustrup: “In addition to Simula67, my favorite language at the time was Algol68. I think that “Algol68 with Classes” would have been a better language than “C with Classes.” However, it would have been stillborn.”
http://www.gotw.ca/publications/c_family_interview.htm

The C++ Fanboy-crowd just doesn’t want to realize that.

Ricky • May 21, 2009 6:06 AM

I think “banish the popular programming functions that’s been responsible for an untold number of security vulnerabilities” is good action which is taken by Microsoft. But I think the problem with the prohibition is that isn’t not memcpy(). The problem is programmers who don’t think about buffer lengths while writing code, who simply don’t bother doing the work to check sizes correctly.

Chris • May 21, 2009 6:16 AM

A lot of you seem to be asserting that bad programmers will screw up anyway, but on what grounds? Sure you can say memcpy_s(dst, len, src, len), but you could also perfectly habitually say f(src, x) { memcpy_s(dst, dst_len, src, x) }. It’s combining two things which should always go together (checking your buffer sizes and doing the copy,) which seems like it might promote a good habit.

Robert the Red • May 21, 2009 7:27 AM

What is really needed in C is a way to check if accessing a memory location is legitimate — without a segfault. Then it would be possible to write safe code.

drscrooge • May 21, 2009 8:06 AM

i think the real solution to this problem is to use structs which are length + buffer instead of a plain buffer and have all functions that modify the buffer check internally to ensure overflow doesn’t happen. for example qmail uses its own string library and i don’t think they’ve had any overflow bugs.

David • May 21, 2009 9:02 AM

@Lawrence: I agree that null-terminated strings are a bad idea. Unfortunately, we’re stuck with them for writing in C and, to some extent, C++ (C++ has the std::string class that is a whole lot safer and more useful, although it can’t eliminate all use of null-terminated strings), and we aren’t going to replace C and C++ at all easily. What we can do is think of ways to deal with them more safely.

@Robert: You can’t check if a memory location is legitimate. Given cooperation on the part of the OS and hardware, it would be possible to check if the memory location was listed as OK to write to, but that doesn’t mean it’s really legitimate to write to. We’d need to check if accessing a memory location is legitimate for the operation we’re doing, since the big problem is usually writing to other data rather than segfaulting.

Dolphin • May 21, 2009 9:38 AM

@nostromo

I’ll give you that it is a side point, but it is hardly nit picking. strncpy is not really much of an improvement on strcpy. yes it protectes the WRITE from going off the end of a buffer. Unfortunately you have to do extra work to make sure that a subsequent READ of that string doesn’t read off the end of the array. Of course, MS could have used strlcpy. It doesn’t fit their naming conventions and the return value is different (better IMO) but it is going to be available on more systems than strncpy_s.

I think the main problem is that MS decided to enable the warnings for the original functions by default. MS is not forcing anyone to use the new functions, just throwing out a bunch of annoying warnings (that you have to opt out of on a per project basis).

In the end though, I do think memcpy_s is a bit on the silly side. And of course, if you want to man functions that are involved in a large number of security flaws, you should look at WinMain (oh gasp, that is not portable!!!!)

JimFive • May 21, 2009 1:24 PM

@David
RE: “You can’t check if a memory location is legitimate.”

If a more secure C language is desired then I think you could do this. You wouldn’t be able to determine if the memory location is correct, but the compiler could determine if the memory access was legitmate, for certain values of legitimate. The compiler would need to check two things: Is the memory allocated? and Is the memory being accessed through a variable for which it was allocated.

The second condition is harder to check (especially for pointers into arrays/lists etc) but it should be at least checkable at some debug level.

JimFive

David • May 21, 2009 1:44 PM

@JimFive: I don’t think it’s possible to test, for reasonable C programs, that memory location X is supposed to be reachable through variable Y. It would be possible to test if X and Y were in the same allocation, but that could get real slow.

If we were to design a new language, we could define a pointer as a base pointer plus current offset with maximum offset attached, while we were ditching null-terminated strings. It’d use memory, and would slow down programs due to increased size and hence cache misses, but it would be safer.

JimFive • May 21, 2009 2:49 PM

@David
It would be possible to determine that the memory accessible by a pointer was allocated as a unit. This would be, effectively, the same thing as determining if an access is legitimate for that variable (handle). You could also prevent pointer incrementing from moving outside of the allocated space in the same manner.

I recognize that there would be difficulties to overcome but that’s true in any language development.

JimFive

Peter E Retep • May 21, 2009 5:30 PM

The MSD banned notice sp terrified the firewall keepers at LAUSD,
that they blocked access to the list of banned commands –
so no one could find out what they are!
Perhaps Bruce needs to cover a new subject,
Security Abuse, defined as actions where people attempt to impose the illusion of security
by terrorizing or using the rationale of terrorizing
instead of providing any actual security?

Anonymous • May 22, 2009 11:16 AM

A statistical approach to finding areas of code that are likely to produce security flaws makes sense. I’m not sure that they actually checked the number of false-positives, however. Yes, a significant number of bugs are related to non-trivial overwrites of memory, and a significant number of non-trivial overwrites of memory are done using memcpy. But does the effort of switching to memcpy_s (doing it portably, the effort is small but non-zero) outweigh other means of spotting the same problems?

moo • May 22, 2009 1:57 PM

As a long-time C and C++ programmer, I agree with other posters above that banning memcpy is pointless and will solve nothing.

Most people disable the “deprecated” warnings Microsoft compilers issue about standard library functions anyway, and go right on using them. I’m not interested in Microsoft’s embrace-and-extend functions with the _s on the end. But even if I was… strcpy_s or sprintf_s might have some utility in preventing a buffer overflow; memcpy_s is totally redundant because YOU ALREADY PASS A LENGTH PARAMETER TO MEMCPY and if you calculated that length wrong then you are *screwed* and asking you to pass it twice (or pass two different length parameters) is not going to change a goddamn thing.

In addition—when I think about most of my uses of memcpy, they are copying partial buffers around, not entire fixed-size buffers. I use it for copying unaligned data to buffers that are properly aligned for them, and things like that. In many cases there is no correct value to pass as the new “buffer size” parameter for memcpy_s, and the programmer will end up passing the same value for both size arguments, and the function will end up doing unnecessary checking with absolutely zero effect on program safety.

Wim L • May 23, 2009 12:49 PM

JimFive: I think that would still only cover a fraction of the problems caused by memory overwrites. If you’re going to implement the amount of type-bookkeeping that it would require, I think you’d be better off moving to a typesafe language. Strong typing doesn’t have to be painful to program with; I’m thinking of languages with relatively complete type systems and type inference (ML being the canonical example). This will catch most of these errors at compile time, which is worlds better than any runtime checking.

On the one hand, we do need “unsafe” languages like C for low-level work. On the other hand, most of the code written in the world isn’t doing low-level work.

Kevin Nickerson • May 23, 2009 3:15 PM

Bah. If they wanted to improve C programs they’d A) Get a copy of PCLint and fix the headers supplied by MS for C/C++ to the top warning levels and B) Change all the ‘safe’ functions that take ‘the number of elements’ and make them take the size in bytes so we can use sizeof without needing a magic divide.

I must admit I hadn’t realized they have a strcpy_s, we made a str_copy function 15 years ago that works like that. Probably the single biggest bug reducer we’ve ever done other than insisting on a clean lint.

Clive Robinson • May 23, 2009 4:25 PM

@ Wim L

“This will catch most of these errors at compile time, which is worlds better than any runtime checking.”

Unfortunatly compile time checking can not catch most clases of malicious code nor does it check code that is to be linked at run time.

“On the one hand, we do need “unsafe” languages like C for low-level work.”

Actually you don’t need unsafe languages at all for doing low-level work. It is one of those “here be dragons” myths. The problem invariably in my experiance is the programmers not the lanquage they use.

“On the other hand, most of the code written in the world isn’t doing low-level work.”

True it is not but even with “safe this” and “safe that” laguages programers still manage to make a compleat pigs ear of it.

I now get on my old and venerable hobby horse and point it towards the nearest windmill 😉

I would rather employ a persone who writes software “in ascociation” with their vocation rather than a person for whom programing is their vocation.

Electronic Hardware and mechanical Engineers usually have little difficulty in writing safe code in assembler that will run in embedded systems for years without issues.

Likewise physisists and mathmaticians likewise appear to have little difficulty writing fairly robust and reliable code

Those few I have met with chemical, biological or medical training likewise have few problems. Ive even seen excelent code from librarians.

What is it about computer science training that sucks? Or is it those drawn to the subject that suck?

I have developed compleate embeded solutions which have ended up in millions of consumer products and have yet to here of any real problemswith systems I designed a quater of a century ago that are still doing their thing out ther 24×365.25 come rain snow and hurricane (and worse).

I’m not a genius few engineers are (we’d be scientists if we where 8). We just appear to have a much clearer perspective of what is needed and how to acheive it reliably.

And before you ask yes I work close to the metal most of the time and still use WordStar style editors. Line for line I cann’t touch those “formular One” “code cutters” for shear output, but unlike them I tend not to have bugs to fix…

Call me the tortoise if you will but I still beat a lot of hairs, especialy those indicative of the mad march ones.

Shane • May 27, 2009 10:32 AM

That good ole’ American sentiment: “It’s somehow related to problems that are occurring, let’s ban it!”

I’m not a fan, and with a decade of development experience in all mess of languages, sorry to say, it would be far more useful to simply fire the idiot programmers who don’t know WTF they’re doing.

Banning memcopy() sounds about as useful as banning water on an airplane because ‘terrorists have been known to drink water while flying, allowing them to hydrate themselves so they can complete their terrifying mission of terror’.

Let’s hope they don’t ban Assembly next… Otherwise compilers are gonna have a rough time, you know, compiling.

rpl • May 27, 2009 3:11 PM

==
Likewise physicists and mathematicians likewise appear to have little difficulty writing fairly robust and reliable code

Those few I have met with chemical, biological or medical training likewise have few problems. Ive even seen excellent code from librarians.

Oh, come on, now. I’ve seen a lot of scientific codes too, and the quality is all over the map. I’ve seen thousand-line do loops, incomprehensible control flow, every variable a global, functions that do completely different operations depending on a flag setting; you name it. And hardly anyone checks return values because they “just know” their library calls aren’t going to fail. Some scientific code is pretty good, but a lot of it is pretty bad, and some is horrible. Pretty much the same as any other discipline, in other words.

I believe science code tends to have fewer egregious failures than general-purpose software because it typically has a much easier job: do the following sequence of calculations a billion times and write the answer to a file. Input is usually fairly simple and often rigidly formatted by another program, so there’s little need to validate it or check its length. There is no event handling and little, if any, dependence on state from outside the program. There is, in short, little that can go wrong, and so scientific programmers can get away with making assumptions that would completely sink, for example, a network stack.

So, while I don’t doubt there is plenty of room for improvement in the discipline of programming, physicists’ and biologists’ code provides no evidence one way or the other on that score.

Anonymous • May 27, 2009 4:14 PM

@ rpl,

I think you missed the point that my comments where with respect to those writing in low level languages for embeded and Hard RT systems, used in demanding real world aplications such as flight control and engine/breaking managment systems.

Not general purpose slop code to hack a quick set of results to enable other activities.

Or three dimensional desktop widgets or office productivity code.

The simple fact is that in my experiance, those who’s chosen vocation was not computer science but engineering or hard science tend to have a more focused aproach to what they are doing.

Especially when it comes to not so obvious and difficult issues that cannot be “abstracted” away to the OS to deal with via threads and IPC…

When it comes to developing Hard RT and top half device drivers, you need an eye for minutia, it realy is function over form every time and it has to meet the micro/nano second time limits each and every time.

Those who’s vocation is ordinary computer science appear to jump in their education from basic while loops to multithred objects without any understanding of 60years of sweat and toil in between that have ,ade those objects not just possible but usable in human time scales.

Thankfully for most they never need look behind the illusion that is multithreaded objects.

Hardware and OS performance has improved so much that it has alowed such abstraction to appear relativly responsive in human terms.

But unfortunatly they are like any abstraction, an illusion, and in this case purchased by the wastefull expenditure of resources.

They most definatly are non functional in terms of the time it takes light to cross the palm of your hand, or realisticaly the time it takes a jet aircraft to fly it’s own length.

And it is in this arena of Hard RT that I realy do see the difference between software artisans with their cosy “patterns” and “object reuse” and engineers and scientists with the cold harsh unforgiving real world to deal with.

Afterall you don’t want the ABS on your car to stop working while your passenger is navigating the menus to adjust the aircon do you?

Mike Becker • June 15, 2009 9:27 AM

This is the illusion of security. I can still overrun the buffer if I pass in bad data, and I can use this API to pull extra data from source into destination if I don’t check for it. A wiz-bang spiffy new API doesn’t make code secure, good engineers do.

Given:

#define SIZE 10

//
// Globals
//
unsigned char Buffer [SIZE];
char password[8];

void
GiveUserSomeData (unsigned char UserBuffer, size_t BufferSize)
{
errno_t Code;

//
// Count the characters for the user
//
for (int count = 0; count < SIZE; count ++){
if ( Buffer[count] == 0 ){
Break;
}
}

//
// Now count will be the length of the Buffer, give it back to the user
//
Code = memcpy_s( UserBuffer, BufferSize, Buffer, count);

if (Code == 0){
//
// Yay! I am a security guru!
//
}
else{
//
// Good job memcpy_s() for saving me!
//
}
}

Problem 1 – leaking data:

Assume: Buffer[] has no null characters in it (oops)

unsigned char MyBuffer[12];

GiveUserSomeData (MyBuffer, 12);
//
// Question, what does MyBuffer[10] contain?
//

Problem 2 – Overrun:

#define SIZE_1 1
#define SIZE_2 2
#define SIZE_3 4
#define SIZE_4 8
#define SIZE_5 16
#define SIZE_6 32

unsigned char MyBuffer[SIZE_4];

//
// Other code here…
//
GiveUserSomeData (MyBuffer, SIZE_5)

//
// This one’s easy to catch, but not all are
//

Tom Crispin • June 15, 2009 11:10 AM

How is banning memcpy() different from the security theater at airports?

Paul • June 15, 2009 12:47 PM

I can hardly wait until Microsoft programmers start generating more than three security bugs with the _s functions and thus require Microsoft to deprecated them as well.

giraffe • June 16, 2009 10:19 AM

Hey, Microsoft, isn’t it true, that memmove(dest, src, len) left, but memcpy was banned?
And what about bcopy(src, dst, len)? Was it already banned?
Or let’s just hope, that windows programmers don’t know such functions… Tss… don’t tell them!

martinr • June 16, 2009 2:29 PM

Requiring programmers to replace memcpy() with memcpy_s() might reduce the amount of buffer overflows, but it does not fix the underlying programming error!

The serious problem with most of Microsoft’s proposed securelib functions is that they suggest silent truncation is a good idea. It’s not. It is just a different variety of the same bug that is less obvious to exploit.

Currently, integer overflow vulnerabilities are on the rise, which are often silent truncation errors (for integers, truncation is usually a modulo operation either during a calculation or when storing the result).

Every piece of code that is looking at data or copying data ought to contain sanity/plausibility checks instead of bogus assumptions. If you have a short buffer, and the source is to long to fit, the programmer must decide consciously what to do, and whether silent truncation and continuing without error is a sensible thing to do. In many situations it is an extremely unreasonable thing to perform silent truncation.

Good code needs >70% of its statements for error handling — the actual algorithm for the good case is often the “easy/trivial” part of the task.

And then, Microsoft fubar’ed snprintf() in their C Runtime, which is why they had to add snprintf_s(). That way Microsoft is giving snprintf() a bad reputation.

Personally, I think that securelib functions are a dumb idea. For traditional code it is easy to distinguish good from bad code in C — the good code contains seperate length checks.

When code uses memcpy_s() instead, you no longer can distinguish new good code doing conscious silent truncation from bad code that is just adhering to a policy.

Only when the good code needs to do sensible error handling, then the explicit check will still be present — and the extra parameter on memcpy_s() will amout to pure waste, and at worst become a consistency problem for future changes.

Training programmers to make conscious decisions about oversized data or integers not fitting into available containers would be a good idea, reduce the number of bugs and significantly improve the overall situation. Hitting programmers with the policy bat is a pretty bad idea.

Requiring programmer to change code without requiring them to actually fix existing or potential bugs is also a bad idea. A securelib policy is likely going to convert many buffer overflow bugs into silent truncation bugs only while giving the false impression that the bugs were fixed.

Address space randomization is an acceptable countermeasure to temporarily make the exploitation of buffer overflows more difficult — no code is changed, so noone gets confused about bugs that were not really fixed, and since no source code is changed, no new bugs are introduced.

mark5009 • June 16, 2009 6:03 PM

@Tom C

It isn’t, and it conveniently ignores the underlying issue(s). IMO, this is the “Curse of C”. As K&R (and others) have pointed out, C was meant as a portable assembler for a PDP-8(11), not as a general, commercial language. There were much better languages for that back in the day.

But Unix was begat, and universities got Unix for zip, and C was taught and promulgated, and the curse spread. And the folks came to believe that C was the language of choice for everything, forgetting the warning that it was “caveat emptor” and, like any assembler, most difficult stuff was left in the hands of the coder. Very necessary when you are writing device drivers and the like. Less helpful for your average accounting reporter.

The solution is to lose C and take a better option, of which there are many.

mark • June 25, 2009 1:01 PM

Forget about fixing a few calls, let’s replace C++, C#, etc. with Ada and fix all these pesky coding problems. Sharp edges are dangerous. Hence the avoidance of “pointers”.

martinr • July 17, 2009 10:33 AM

I’m really sad to see so many people promote the million monkey approach to software development.

Personally, I do prefer to be a good programmer, technical expert and improve my expertise constantly, rather than being a simple-minded code monkey with mediocre skills whose code would be completely unusable if there weren’t stringent programming languages and tools.

Although I personally definitely no fan of pair programming, I think the other decision of Microsoft — to perform development of Windows 7 with half the Manpower and double the quality is in principle a very wise idea.

And if you manage to combine the expertise of two such developers into one person, then it is the scheme that I’ve been successfully doing during the last 15 years with mission critical business software.

Expertise also significantly helps in recognizing bugs (malfunctions rather than vulnerabilities) of others in code or documentation (of which there are plenty) and devise workarounds.

For may years there have been tools like purify to locate many of the buffer overflow errors (ABR array bounds write) plus a lot of others, and using that tools to improve ones skills is a really good training. Requiring users to use so-called “modern” languages takes those training possibilities away from the programmer, and will reinforce their mediocricy.

A 4-wheel car is a car that breaks down where no towing car can get to rescue it.

Kris • July 16, 2010 2:51 AM

Why don’t you people just replace your brains with something? Try bird food for a change!
I can’t believe I am unable to compile a stupid DLL written in C because the god darned linker spits out something like “undefined reference to _snprintf”! Have you ever tried to understand the concept of “standard”? Hell, my code is written in standard C & a so-called C compiler can’t get it to binary form…

Why don’t you just ban your lives!!

Schneier on Security