Schneier on Security
A blog covering security and security technology.
February 5, 2008
A Good Security Investment by DHS
They're paying for open source software to be scanned for security bugs, and then fixing them.
All the software scrutinized was found to have significant numbers of security flaws, Coverity said on Wednesday. Since 2006 the project has helped fix 7,826 open source flaws in 250 projects, out of 50 million lines of code scanned, the company said.
They find, on average, one security flaw per 1,000 lines of code. And when the flaw is fixed, everyone's security improves.
Posted on February 5, 2008 at 6:30 AM
• 42 Comments
now that's good tax dollars spent, imnsfho
It's hard to believe that a govt agency could be so reasonable and helpful, esp. the DHS. Who was responsible for this project? Whose idea was it?
It's almost like there's a DHS within the DHS.
To show the true paranoid response this calls for,
"Oh my Gaud who's checking the changes are secure"
Hopefully somebody is looking at them for two real non paranoid reasons,
1) mistakes happen
2) is their work ok to be paid for.
As I'm not paying (being non U.S.) the first is of concern to me due to the law of "unintended consequences". Sometimes a rational choice to improve one aspect of a program has the unfortunate and unintended consequence of weakening some other part...
If they find one security flaw per 1,000 lines of code while scanning 50 million lines, this should sum up to about 50,000 security flaws. So, where are the other 42,174? Still unfixed? Or do they count lines that belong to the same flaw only once in those "7,826 fixed flaws"?
@Clive: Yes, but this also holds for non-governmentally sponsored bug fixes. Even more for bug fixes from "anonymous sources", by the way.
Joe Jarzombek if I remember well...
Of course you realize, they leave the really juicy ones in, and log them for any future use.
Not to nitpick or anything, but... 50,000,000 / 1,000 != 7,826
1 bug per 1000 lines of code is often cited as the closed source industry standard, but it seems to me that at least the open source projects that the DHS have tested had only 1 bug per 6400 lines of code.
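For anyone who wants to check the arithmetic in the comments above, here is a minimal sketch using only the figures quoted in the post (50 million lines scanned, 7,826 flaws fixed, one flaw per 1,000 lines). The function and constant names are made up for illustration.

```python
# Figures quoted in the post; the names are illustrative, not from any tool.
LINES_SCANNED = 50_000_000
FLAWS_FIXED = 7_826
LINES_PER_FLAW_QUOTED = 1_000  # "one security flaw per 1,000 lines of code"


def implied_flaws(lines: int, lines_per_flaw: int) -> int:
    """Number of flaws implied by a quoted density over a code base."""
    return lines // lines_per_flaw


def lines_per_fixed_flaw(lines: int, fixed: int) -> float:
    """Average number of scanned lines per flaw that was actually fixed."""
    return lines / fixed


implied = implied_flaws(LINES_SCANNED, LINES_PER_FLAW_QUOTED)
unaccounted = implied - FLAWS_FIXED
per_fix = round(lines_per_fixed_flaw(LINES_SCANNED, FLAWS_FIXED))

print(implied)      # 50000
print(unaccounted)  # 42174
print(per_fix)      # 6389
```

So the "1 per 6400 lines" figure is the rate of fixed flaws, while "1 per 1,000" would be the rate of reported ones; the two numbers measure different things.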
I'm one of the Python developers and subscribe to the -checkins list, so I watched the various fixes for the problems that Coverity reported. The fixes were made by several of the Python core developers, not Coverity, so the changes should be of equivalent quality to the rest of the code.
Most of the flagged items for Python were null pointer dereferences, often in error-handling code, and there were also a few incorrect return values and buffer overflows.
"Of course you realize, they leave the really juicy ones in, and log them for any future use."
Let's accept the hypothesis that they are. If they weren't offering fixes, they could *still* be "log[ing problems] for any future use". So the program is still good news, because a fix is a fix and still a gain (even if there are other issues that could be fixed but weren't).
And if they're not doing that, but have missed a lot of bugs, as seems likely ... well, a fix is a fix and still a gain (even if there are other issues that could be fixed but weren't).
A fix is a fix, and this is good news. Those who are determined to approach life seeing every glass as half empty -- even those that are not -- will just have to go away unsatisfied that there's been good news and the rest of us are pleased.
It seems to me that they would probably pay to fix bugs in closed-source code, too -- but the point is, because it's closed-source, they can't...
1 line in 1000 is exceptionally good.
Especially given the number of false positives generated by robotic bug scanners.
Things like using "strcpy" instead of the safer "strncpy" are highlighted as errors by these scanners -- which is good, but they fail to detect that the code will be executed only on "safe" data internal to the program and is therefore not a security risk.
The other major false positive is as mentioned before in error handling, especially in "bail out" routines where you are going to halt execution anyway so why fuss with memory clean ups and dereferences etc.
So the real figure is probably 1 in 10000 lines of code.
"And when the flaw is fixed, everyone's security improves."
Except Windows users. Will somebody please think of the Windows users?!!?
It looks like some of the commenters are missing some of the points:
* The government is paying for the scans, but NOT producing or suggesting any code fixes
* The developers are given the results of the scans, but don't have to act on them
Our tax dollars are being spent on identifying problems and the responsible parties are then provided with the opportunity to correct those problems as they see fit, within the parameters of their development efforts, at their own pace.
For those who are doing the complicated math, the articles indicate that ~7000 fixes were developed, NOT that only 7000 bugs were found. It is quite likely that the number of bugs found was about 50,000 (1 per 1000 lines of code)...and if that number of bugs was reduced to 42,000, then that is certainly better than having no bugs fixed.
Since the code scans cost the taxpayer $300,000, that equals $42 of taxpayer money for every vulnerability FIXED.
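The cost-per-fix figure depends on which fix count you divide by. A quick sketch, assuming the $300,000 from the comment above and comparing the 7,826 fixes quoted in the post with the rounded "~7000" used in the comment (all names are illustrative):

```python
SCAN_COST_USD = 300_000  # figure used in the comment above
FIXES_QUOTED = 7_826     # figure quoted in the post
FIXES_ROUNDED = 7_000    # the "~7000" approximation from the comment above


def cost_per_fix(total_cost: float, fixes: int) -> float:
    """Dollars spent on scanning per defect that was fixed."""
    return total_cost / fixes


exact = round(cost_per_fix(SCAN_COST_USD, FIXES_QUOTED), 2)
rounded = round(cost_per_fix(SCAN_COST_USD, FIXES_ROUNDED), 2)

print(exact)    # 38.33
print(rounded)  # 42.86
```

Either way the result lands around $40 per fix, so the rough $42 figure is in the right range.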
> Things like using "strcpy" instead of the
> safer "strncpy" are highlighted as errors
> by these scanners -- which is good, but
> they fail to detect that the code will be
> executed only on "safe" data internal to
> the program and is therefore not a
> security risk.
Until someone changes the routine around, or reuses it elsewhere, and it stops processing "safe" data and becomes a security vulnerability, of course.
What have you done with the real DHS?
And for their next trick, they'll make the trains run on time.
I'd like to have my own turn at clearing up some confusion.
The value of 1 bug per 1000 lines of code was an average across ALL open source projects. They also claimed that closed source software has a similar rate of bugs.
They then go on to list a number of projects with much lower than average numbers of bugs found by their methods, with rates like 1/10000, 1/20000 and 1/69223 (the last project had only one bug found, but they haven't fixed it yet).
This doesn't prove that open source software is more secure, but it does suggest that high-profile open source software is. The many-eyes-looking-for-bugs theory actually works. By inference, to balance the 1/1000 average, there must be plenty of open source projects with bug rates around the 1/100 lines of code and only three active developers.
For those of you worried about what has happened to the DHS; relax, this wasn't their idea, they are just the source of cash :-)
This started a few years ago when Coverity made their product available to some open source projects. FreeBSD was one of them and there were a number of bugs found as a result.
http://scan.coverity.com/ was announced two years ago with funding from DHS; they were looking at about 40 projects to start with. At the start of their second year they had 150 projects and they currently have over 250.
Fortify Software has a similar program where they will make their source code analyzer (SCA) tool available to open source Java projects. Pretty cool stuff - check out http://opensource.fortifysoftware.com/
@Chalmer: "Since the code scans cost the taxpayer $300,000, that equals $42 of taxpayer money for every vulnerability FIXED."
I'm not sure if you're arguing for or against this expenditure (or maybe it's just an observation), but $42 is about 1 developer-hour---less than that if you include business overhead.
That's cheap! Sure, most bugs found by automated scanning software will be straightforward to fix, but they're still bugs, and many of them are "security"-related bugs.
Since many of these projects are high-profile, Internet-facing programs (Apache, Linux kernel, NTP, Postfix, Perl, Python, PHP), an exploit for one of these bugs could cost just the government itself, not to mention the rest of the country, more than $300k in lost productivity.
This is a good idea, and the sort of thing that also gets sponsored by the government in CS systems research. That the DHS is also putting cash towards it is just some surprising foresight on someone's part.
"[the DHS,] they are just the source of cash :-)"
They're not the source. American taxpayers, some willing, most not, are the source of the cash.
The proactive fixing of security-related defects is *always* cheaper than the reactive release of a security patch to fix a zero-day exploit found in the wild.
Why isn't scanning used more? IMO? Because it's tedious work and most programmers, myself included, *hate* tedious work.
However, source code scanning is an invaluable technique that has existed since the earliest Unix days. Does anybody remember the lint utility?
It astonished me 20 years ago, and still does now, that lint and its ilk aren't used more. Maybe it's because some of those tools cost $$$ to acquire.
Actually, lint is being actively used and developed by at least one major open source project: OpenBSD
And it's helped them fix a lot of bugs in their code, and it's still going too. They build it into the compilation of all libraries, and make it extremely easy to compile every single piece of source code to include lint. They are also one of the projects benefiting from the Coverity scans.
The Coverity Prevent static analysis tool scans for a whole host of possible defects. Only a small subset of those are actually security bugs. The one bug per 1000 lines is for the total number of defects, not for security-focused defects.
I use the tool regularly and find it to be fascinating. One of the little known uses is to identify bad code. Jeremy Allison has talked about this a little bit. But basically the tool is fantastic for exposing copy and paste defects (wow, same defect is in 8 parts of the code base). It also might find four defects in the same 100 lines. That code should be marked as red hot and in need of fixing.
Someone was surprised that DHS was sponsoring this. If I recall, CERT is part of DHS now. They have always been forward thinkers. So if they were the ones driving this, I would not be surprised in the least.
Great tool. And I think that DHS doing this means that all software will get better. The commercial software makers will need to get this tool running just to compete in stability.
NSA was helping out on SELinux, to help security in general, until Microsoft started whining about unfair competition. We'll see how long this new effort lasts...
That is useful to know, and somehow I am not at all surprised that it's OpenBSD that you are citing.
They have a very proactive approach to the quality of their software.
I use Coverity at work, and it's terrific. It finds all kinds of bugs, not just security flaws.
Before I read Erik's remark, I'd already guessed that overall bug rate was 1 per 1000 lines, and the security flaws were a lot less frequent, and the press reports just weren't accurate. Not surprising. I've found technology reporters and press-release writers get such details mixed up a lot.
I don't know if this program is using it this way, but at work the normal practice is to run Coverity over our entire code stream several times a day. It keeps track of which bugs it finds are old and which are new. So if this program ran Coverity over these projects, say, once a week, or whenever they had a new release, it would be great, and you could just sit and watch the bug (and security flaw) count drop.
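The old-versus-new bookkeeping described above can be pictured as set arithmetic over defect identifiers. This is only a sketch of the idea, not how Coverity actually does it; the defect IDs and the function name are hypothetical.

```python
def compare_scans(previous: set[str], current: set[str]) -> dict[str, set[str]]:
    """Classify defects by comparing two scan results.

    Each defect is represented by a stable identifier (hypothetical here),
    e.g. something derived from file, function and defect type.
    """
    return {
        "new": current - previous,          # appeared since the last scan
        "outstanding": current & previous,  # still present in both scans
        "fixed": previous - current,        # gone since the last scan
    }


# Made-up identifiers for illustration only.
last_week = {"parser.c:null-deref", "net.c:leak", "ui.c:dead-code"}
this_week = {"net.c:leak", "auth.c:overflow"}

report = compare_scans(last_week, this_week)
print(sorted(report["new"]))          # ['auth.c:overflow']
print(sorted(report["outstanding"]))  # ['net.c:leak']
print(sorted(report["fixed"]))        # ['parser.c:null-deref', 'ui.c:dead-code']
```

Run on every build, a report like this makes regressions visible as soon as they land, which is what lets you watch the count drop over time.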
I have fixed several hundred of the issues they reported for the Linux kernel.
The Coverity scanner reports several classes of suspicious code, such as "dead code" (e.g. an "if(a)" where a compiler can prove that "a" will always be 0 at this point).
Some of their results point at bugs.
And some of these bugs are even security bugs.
But in many cases the solution was e.g. the removal of dead code.
And many of the actual bugs are stuff like small memory leaks in exotic (and not user triggerable) error paths.
The Coverity scanner is a helpful tool.
And there are many worse things the US government spends its taxpayers' money on.
But I have to say you were fooled by false information in the PC World article you link to: Coverity itself knows why its press release says only that it "has helped fix over 7,500 software defects", not that all of them were security flaws.
'using "strcpy" instead of the safer "strncpy"'
If you're using C strings, using a fixed-size string copy function instead of strcpy is a good idea, but using strncpy for that purpose is not - "strncpy" was designed to be able to write into non-nul-terminated structures - it doesn't always write the nul.
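To make the missing-terminator point concrete, here is a small Python model of the documented C behaviour of strncpy. It is a simulation on a bytearray, not a call into libc, and the function names are invented for this sketch.

```python
def strncpy_model(dest: bytearray, src: bytes, n: int) -> bytearray:
    """Model of C's strncpy(dest, src, n) on a fixed-size buffer.

    `src` stands for a C string; anything after an embedded NUL is ignored.
    """
    if n > len(dest):
        raise ValueError("n exceeds the buffer; undefined behaviour in C")
    text = src.split(b"\x00", 1)[0]
    copied = text[:n]
    # Short sources are padded with NULs up to n bytes; long ones are
    # truncated and NO terminator is written.
    dest[:n] = copied + b"\x00" * (n - len(copied))
    return dest


def bounded_copy_model(dest: bytearray, src: bytes) -> bytearray:
    """Common idiom: copy at most size-1 bytes, then terminate explicitly."""
    strncpy_model(dest, src, len(dest) - 1)
    dest[-1] = 0
    return dest


short = strncpy_model(bytearray(b"\xff" * 8), b"hello", 8)
long_ = strncpy_model(bytearray(b"\xff" * 8), b"helloworld", 8)
safe = bounded_copy_model(bytearray(b"\xff" * 8), b"helloworld")

print(bytes(short))  # b'hello\x00\x00\x00'
print(bytes(long_))  # b'hellowor'  (no NUL anywhere in the buffer)
print(bytes(safe))   # b'hellowo\x00'
```

The second case is the trap: the buffer is full and unterminated, so any later code that treats it as a C string reads past the end.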
Good observation about the fact that $42 per defect fixed is a decent bang for your buck.
I also like your point that this is especially important when we consider that these aren't just trivial software packages picked randomly, but software that holds a prominent place in not only government circles but in the backbone of the Internet and worldwide commerce.
I hope the software of companies with classified surveillance program contracts will benefit from the project—although we don't know which ones or how many are dependent on proprietary code, do we. Some of them are engaged in operations that I cannot entirely approve of, so why would I hope for a thing like that? Because I don't want the sensitive data they're amassing to be obtained by foreign governments, criminals, or mischief-makers.
According to articles written when the program was first announced (Jan 2006), the total grants were $1.24 million.
"The Homeland Security Department grant will be paid over a three-year period, with $841,276 going to Stanford, $297,000 to Coverity and $100,000 to Symantec...."
The $100,000 was to also test the Coverity software in a proprietary environment:
"Symantec will... test the source code analysis tool in its proprietary software environment...."
There should still be, at least, one more year of funded testing results provided.
You guys are completely missing the point. This is a list of bugs in open source software. Known bugs. Now when a person wants to justify purchasing 50,000 copies of closed-source software he will simply whip out this list, say "see open source has 50,000 known bugs, therefore closed source is the better choice" and buy what he originally wanted in order to get the free secret decoder ring in the box. Or whatever civil servants want as bad as a 5-year old wants a decoder ring.
This is a preemptive-CYA maneuver for the govt, not an attempt to get better software.
The actual number and severity of defects in any project is something of an open debate. Not all defects are as clearly defined as a buffer overrun.
A perfect, bug free implementation of a flawed design is not going to be caught by a code scanner.
Also, not all lines of code are created equal, and comparisons across code bases are rarely accurate. Simply deciding to omit or include comments and whitespace in KLOC counts produces drastically different results. Not to mention what you choose to count as code: source? headers? resources? Installer scripts? Batch files?
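A toy illustration of how much the counting rule matters. This sketch counts lines in a small made-up C snippet with and without blank lines and comments; it only recognises whole-line "//" comments, which is a deliberate simplification, and all names are illustrative.

```python
SAMPLE = """\
// copy a string
#include <string.h>

int main(void) {
    // nothing to do
    return 0;
}
"""


def count_lines(source: str, skip_blank: bool = False,
                skip_comments: bool = False) -> int:
    """Count lines, optionally ignoring blank and whole-line // comment lines."""
    count = 0
    for line in source.splitlines():
        stripped = line.strip()
        if skip_blank and not stripped:
            continue
        if skip_comments and stripped.startswith("//"):
            continue
        count += 1
    return count


physical = count_lines(SAMPLE)
code_only = count_lines(SAMPLE, skip_blank=True, skip_comments=True)

print(physical)   # 7
print(code_only)  # 4
```

The same file is 7 lines or 4 lines depending on the rule, so a single defect in it would give noticeably different "defects per KLOC" figures.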
@"Actual Source": if you want to get nit-picky about it (and evidently you do, or you would have kept your mouth shut), the source of the DHS funds is the government coffers, and the source of the government coffers is the taxpaying citizens, and the source of the taxpaying citizens is the business owners, and the source of the business owners is the consumers, and... here we seem to be stuck in a circle, but if we go far back enough to the root cause... the source is the government, without which we'd be making stuff and bartering it for other stuff.
The wealth, my friend. The source of the wealth. Not the source of the printed currency.
Seconding what others have said -- static code analysis is helpful (indeed, awesome) but won't catch lots of actual important bugs. It makes sense that much of what it would find is low-importance bugs/dead code in something as carefully scrubbed as the Linux kernel. Bugs obvious to Coverity are probably obvious to reviewers, too, or they're crashes that come out in testing.
Still, kudos to DHS for paying for something with a pretty good bang-to-buck ratio.
Static code analysis can probably do more if the code is annotated with more information by developers (e.g., the information that it would need to check for errors in reference-count bookkeeping, etc.). Larger boosts come from things that are even harder to build in after the fact, like rearchitecting to minimize the amount of privileged code, borrowing pieces of the design from more-secure OSes (Orange Book level Bx), or even writing some stuff with languages or libraries that disallow common mistakes like overflowing buffers. The intense review and testing that the kernel team already uses is surely one of the best things they could do.
None of this is a recommendation to DHS or the kernel team; just pointing out that Coverity, while useful, is far from the be-all end-all.
To whom do we send our letters of gratitude? We're so quick (well, not really) to complain when some government official works against our interests. Perhaps we should likewise send our appreciation and support, as they might be able to use it as political capital in the future, leading to even more beneficial actions.
It would surprise me if the real aim of DHS's effort isn't to introduce back doors, meant for its own use, into the likes of OpenPGP and TrueCrypt.
The lesson to be learned here is never to trust your critical data to any open-source product, unless you have the time, knowhow, and means to personally and minutely examine and analyze the product before installing it. Even if the founder of an open-source project is a good guy, he's unlikely to have checked up on any of his volunteers before accepting code from them.
Schneier.com is a personal website. Opinions expressed are not necessarily those of BT.