Eric Raymond, in The Cathedral and the Bazaar, famously wrote
This is a companion discussion topic for the original entry at http://blog.codinghorror.com/given-enough-money-all-bugs-are-shallow/
This is not a new issue. The same thing happened many years ago with closed source software. I remember arguing vehemently and unsuccessfully against DLLs, as they required an implicit trust in the published interface. Everything else was a black box hidden behind the façade.
Of course I was totally unsuccessful in my efforts. Where I was wrong was in my assumption that had I had the ability to follow the code behind the façade and step through the black box that I could have validated the box. In reality even in the late 80’s the vast majority of bugs I would have found would have been my implementation bugs - not bugs in the boxes.
Similarly as you have commented the capacity to evaluate and read code in specialist areas is
Now we have a huge field of candidates who are experts at spending their time in areas that include all of the three groups above. I refer to computer studies students at the universities throughout the whole world. Why not make their third year elective project have the option to do an in-depth analysis of a small part of critical code routines in the public domain?
They can be marked on the rigour of the process - with top grade marks going to the effectiveness of the analysis - not whether they found a bug or not.
I was a little surprised to not see a mention of efforts like Project Zero in this post, because that may represent a possible solution. Rather than just blindly throwing money at bounties, having large corporations who depend on these projects forming dedicated teams to tackle the security of open source projects seems like it solves many of the issues you raised:
Of course it requires more companies to realize just how much they rely on the security of these projects, and then turn that into a commitment to improve that security.
If your focus is the payout, who is paying more? The good guys, or the bad guys?
I suspect that the bad guys have always been willing to pay. If the good guys weren’t paying, the bad guys would always win this test by default! Therefore the fact that the good guys are increasingly turning toward bug bounty programs is a good thing in this context.
Money makes security bugs go underground. There’s now a price associated with exploits…
Isn’t this the other way around? Money for security bugs has been around almost as long as software. Hackers have always been willing to pay, directly or indirectly, for illegitimate access to computer systems. It’s just that now the good guys are also willing to pay.
There are two related problems.
The first is Economics 101. Look up “Public Goods and Externalities”. An externality is like pollution where I cause everyone in the city a fraction of a penny in damage by releasing smoke or soot. No individual is damaged enough to make it worth the time to recover. A public good is something where any person or group doing the work will benefit the public, not only them, and there is no way to prevent it. An example is a radio broadcast of a song, or not horribly locked down software. Copyrights and patents are an artifice trying to prevent information and ideas from becoming public goods.
Billions use OpenSSL. I was there when it was SSLeay. If each contributed a fraction of a penny, it could be made secure. But they won’t. They will hope Google or Apple or someone else will write one big check. Linux has a patronage system - RedHat, Ubuntu, and the Linux Foundation all have managed to monetize things - because they need the kernel to be better, and it is easier to sponsor code artists to work than to try to do something closed. They benefit enough.
But the second problem is that many projects don’t start at “The Quality Plateau” (see http://the-programmers-stone.com/the-original-talks/day-2-thinking-about-programming/ about 2/3 of the way down). If OpenSSL was refactored, many of these bugs would be found and disappear. (I’m making a good living cleaning up a project with lots of technical debt). It is possible to write almost bug-free code, but you have to approach it as trying to find the most elegant mathematical proof. 50 lines, not 500.
There are always sponsors to include new features - the new crypto algorithms, formats, handshakes. But each time there is only enough to shoehorn and staple the feature in, so technical debt increases.
Technical debt will be repaid, but this is where the “with enough money” comes in: to pay for eyeballs.
I should note refactoring to the quality plateau is itself an art and not all programmers can do it. Linus and team generally can, and they generally don’t let anything into the kernel that isn’t close.
Technical debt is easy to pay down AT THE BEGINNING, but like running up a bill on your credit card, the interest compounds exponentially. The small blob of spaghetti code is copied, metastasizes, and the tendrils of the spaghetti leak into and across the API (/* don’t call this with zero for parm 2 */).
The difference between the Linux Kernel and OpenSSL is that there is almost no Technical Debt in the kernel, and the little bit that is there is like a bill that fell between the cracks. OpenSSL from the beginnings in SSLeay started and continued to accumulate TD because it was complex and if something worked - met all the tests - no one was going to spend 2 days refactoring and cleaning.
Part of the problem with many Open Source projects is the atrocity known as the GNU build system: autoconf, automake, and libtool. Together they generate a configure script that takes 15 minutes to run to detect whether strcpy exists on the current system, and fails, and then you have to add --without-x and wait another 15 minutes. It often takes longer to get the make started than to compile the whole thing. Linux is cross-platform, but in the right way. It builds its own host tools, and the idea is to make the code portable, not to have thousands of detection options. Eric Raymond moved gpsd to SCons - and you can read his posts. The problem is that it is stupid and futile, and makes things ugly, to try to support a 0.01% platform. I’m not sure if anyone has tried to get GTK to compile for a DEC20 or microVAX, but it still might with enough hacking. Or an old Sun SPARC system.
The code should be portable, and the build environment should not be one huge ugly hack to make a thousand variant stanzas of code run. Target the normal POSIX (compile) - Linux, BSD, Cygwin, and friends, then add a glue.h where anyone with some strange platform can add routines or macros. Use GNU libc, don’t try to use Sun’s bzero. (Cygwin is a hack, but it is better to have a Cygwin or MinGW on Windows than to try to make all the libraries and programs compile using direct native Windows calls).
It’s going to be a big, slow project, but I believe that the whole software engineering culture is shifting towards more appreciation of reliability. At one extreme, there are now almost-production-ready projects like CompCert and seL4 that formally prove non-trivial correctness properties. At the other end of the spectrum there will always be half-baked code. But I really think that the whole ecosystem is shifting in the direction of reliability.
One option for bug finders is just to look for bugs in software that pays for bugs. There are a lot of new bug bounty programs out there: https://bugcrowd.com/list-of-bug-bounty-programs
I work as a receiver of bugs for a company security response team, and am starting to see a lot more requests for bounties than we ever did. Almost always, the reporters are doing a reasonably good job of reporting bugs and working with the vendor. The number of reports that are not actionable or are poorly crafted has slightly increased, but this is not a difficult problem, yet.
Vulnerability sharing as social activity:
The problem is bigger than this. It is two-fold, and one of the points has already been partially addressed:
Thus, the solution is really not all that hard:
Make the environment more receptive to “free-roaming auditors”, and they will come.
(1) Since I am apparently a ‘new user’, I cannot post more than two links. Here’s some more examples in non-link form:
EDIT: Just to make sure that my second point comes across well: it is not worth it for me to audit code for vulnerabilities, because it’s exceedingly likely to just lead to yet another shouting match, destroying my mood and focus for the rest of the day. I know enough other people who feel similarly about this - the gratification of contributing to a project simply isn’t there when security-related reports are not appreciated. This seems largely a problem with project maintainers who feel their ego is ‘hurt’ when somebody points out a security issue with their code.
The other problem is there is a lot of snake-oil or laetrile available.
The worst was the sales pitch for I think “Codility”. The example concerned a password field, but it would not check for NULL pointers so would segfault. Their demo fixed it by returning if the pointer was null. I was the only one in the room that said, “If you have a bug in a security area, the last thing you want to do is to make it silent. The only thing worse than a crash is a silent failure”.
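To make the silent-versus-loud distinction concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the function names are hypothetical, `None` stands in for the NULL pointer, and the comparison logic is not taken from the demo.

```python
import hmac


def check_password_silent(supplied, expected):
    """Mirrors the demo's fix as described above: quietly return on a missing value."""
    if supplied is None:
        return None  # the caller cannot tell "bug upstream" from "nothing to check"
    return hmac.compare_digest(supplied.encode(), expected.encode())


def check_password_loud(supplied, expected):
    """Alternative: treat a missing value as a bug and refuse loudly."""
    if supplied is None:
        raise ValueError("password field was never populated")
    return hmac.compare_digest(supplied.encode(), expected.encode())
```

With the silent version, a caller that forgets to handle the `None` result carries on as if nothing happened; with the loud version, the same mistake stops the program at the point where the bad value first shows up.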
Then there’s Hungarian notation, making sure there are curly braces in the right place, proper useOfUppercase, and other nonsense. Sometimes it is useful, but most often it is irrelevant.
C++ was supposed to save the world. Instead it brought bloat and resource leaks since you would have to often trace very deep paths to figure out what one line actually did - and the constructors could allocate any memory, spawn threads, who knows what. But it was “correct” as long as all these monsters, dragons, and ogres were kept hidden. “Why is my disk being accessed all the time?”. It is hard enough to avoid or debug a resource leak when all the code is right in your face. If it is scattered across an archipelago of dynamic objects, forget it. (I remember more than once someone suggested changing the Linux kernel from C to C++).
I would agree that “code is too big and not organized correctly”, but the problem is that the very same code can be written far more compactly and in modules with the proper “fracture lines”. Automated correctness provers will be a plague if they encourage writing 2000 lines and running them through a prover rather than writing 200 lines that can be eyeball reviewed.
And I would be even stronger about arbitrary projects. There is utterly no point in reviewing anything. Either they are clean and clear and cannot be refactored (except for a different tradeoff which would be immediately obvious), or they need all the strands of spaghetti removed, so it would only be worth it or not to properly and completely refactor such. Reviews can merely state the technical debt - oh, interest on your credit cards is taking up 80% of your monthly income and you are going to go bankrupt shortly.
This is definitely a big part of it. What proportion of security bugs are buffer overruns? And what proportion occur in code that could have more easily been written in a language that fundamentally disallows buffer overruns and has fully defined behaviour? It seems like this would cover almost all of them.
That applies to most desktop software - and is also a reason I strongly argue against using unmanaged languages unless there’s absolutely no other option - but in the case of web-based things, there’s an entirely different host of issues that you could run across. Both easily and generically preventable issues (such as SQL injections, just use parameterization) and issues that are not so easy to generically prevent (eg. CSRF attacks, as they require a modification of page contents and scripts).
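As a rough illustration of why CSRF protection touches page contents, here is a minimal sketch of a per-session token using only the Python standard library. The names `SERVER_KEY`, `issue_csrf_token`, and `is_valid_csrf_token` are hypothetical, and a real framework would normally supply this for you.

```python
import hashlib
import hmac
import secrets

# Hypothetical per-deployment secret; assumed to be kept on the server only.
SERVER_KEY = secrets.token_bytes(32)


def issue_csrf_token(session_id: str) -> str:
    """Derive a token bound to one session; it has to be embedded in every form."""
    return hmac.new(SERVER_KEY, session_id.encode(), hashlib.sha256).hexdigest()


def is_valid_csrf_token(session_id: str, submitted: str) -> bool:
    """Accept a state-changing request only if it echoes the session's token."""
    expected = issue_csrf_token(session_id)
    return hmac.compare_digest(expected.encode(), submitted.encode())
```

The check itself is short; the work is in making sure every form and script on the site actually sends the token back, which is the part that cannot be bolted on generically.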
I’m sure you could do a quantitative analysis on the public ones (e.g. the ones that have a CVE identifier).
User-supplied data being interpreted as instructions is just one of many fundamental security problems; the solutions to which are usually obvious. Want to stop buffer overflows from being exploitable? W^X prevents writable memory from being executable. Want to stop SQLi? Use prepared statements (in which instructions are sent in a separate packet from the data). Want to stop XSS? For PHP developers, we have HTMLPurifier (if you want to allow HTML); fundamentally, you want to escape all HTML special characters (especially quote characters).
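A minimal sketch of the last two remedies, in Python rather than PHP. The table, column, and function names are made up for illustration, and the in-process `sqlite3` module stands in for a client/server database, so nothing here travels in a separate packet; the point is only that the data is bound as a parameter and never spliced into the SQL text.

```python
import html
import sqlite3


def find_user(conn, name):
    """Look up a user with a bound parameter instead of string concatenation."""
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


def render_comment(text):
    """Escape HTML special characters, including both quote characters."""
    return "<p>" + html.escape(text, quote=True) + "</p>"


# Hypothetical in-memory table, just enough to exercise the query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))

hostile = "alice' OR '1'='1"
print(find_user(conn, "alice"))   # the legitimate lookup finds the row
print(find_user(conn, hostile))   # the hostile string is treated as a plain name
print(render_comment('<script>alert("x")</script>'))
```

The hostile string matches no row because it is compared as a literal name, and the script tag comes out as inert text.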
Here’s the kicker though: These types of security vulnerabilities are but a slice of the pie. You also have to consider your operating environment (you can secure one application, sure, but what about the hundreds of daemons you have running? Are they all safe? Are they even up-to-date? etc.), application logic, physical security, network security (where cryptography is usually most relevant), and possibly the hardest topic in security: Human operators.
Even if every buffer overflow in every operating system were fixed tomorrow, we would still see data breaches.
I don’t know why security bugs are singled out here; it’s the same issue for all kinds of open source bugs. In my experience as someone who has contributed bug fixes to a bunch of open source programs, the biggest obstacle is getting to the point where you have the app running in a debugger where you can set a breakpoint. Once you get to that point, finding the issue is the fun and easy part. After you have the bug fix, the second biggest obstacle is figuring out the process and the unwritten social conventions for submitting the patch. In my experience, once you have the patch and have figured out how to submit it, the project’s developers will be happy to help you take it over the finish line.
I think open source projects should provide packaged up development environments with the app ready to be debugged and patch submitted. Maybe vbox images with the latest code, full build environments and patch submission as scripted as possible.
I can only speak for myself of course, but security bugs are by far the most harmful class of bug. Whereas a graphical bug is an inconvenience, and a data corruption bug means you’ll lose a few weeks trying to reconstruct your dataset, a security bug typically translates to irrevocable compromise of confidential data, violation of privacy, theft, or other types of serious impact.
Many of the same issues also exist for other classes of bugs. Fix the security bug handling, and everything else is going to be a lot easier to deal with as well.
Quoted for emphasis. If you can handle “Hi, your code is swiss cheese. I could fly a Boeing 747 through your authentication protocols (e.g. comparing MD5 hashes with the == operator, so any hash ~ /^0e[0-9]+$/ will match any other hash with the same pattern),” you can handle, “Doing this, this, then this makes the window ugly.”
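For anyone who has not seen that hash comparison problem, here is a simplified Python model of it. `loose_equals` is a hypothetical stand-in that only imitates the one behaviour described above (two strings shaped like `0e` plus digits both being read as the number zero); it is not a full model of any language's comparison rules. The two digest strings are simply examples that fit the pattern.

```python
import hmac
import re

# The pattern from the comment above: "0e" followed only by digits.
MAGIC = re.compile(r"0e[0-9]+")


def loose_equals(a: str, b: str) -> bool:
    """Simplified model of a loose ==: two 0e<digits> strings both read as zero."""
    if MAGIC.fullmatch(a) and MAGIC.fullmatch(b):
        return True
    return a == b


def strict_equals(a: str, b: str) -> bool:
    """Byte-for-byte, constant-time comparison of two hex digests."""
    return hmac.compare_digest(a.encode(), b.encode())


stored = "0e462097431906509019562988736854"
forged = "0e830400451993494058024219903391"

print(loose_equals(stored, forged))   # the two different digests "match"
print(strict_equals(stored, forged))  # the strict comparison rejects them
```

Comparing the digests as strings, ideally in constant time, closes that particular hole.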
@Adam_Pflug although there was no mention of that group, you do realize the “solution to many issues he raised” you posted is already there at the end of the blog post?
Now, what I would like to see would be all this applied to small projects. Not just for security flaws, but to make at least all big bugs ubiquitous to the small dev. I’m not even sure if something like this already exists or not.
if (foo && foo.length > 0)
Unfortunately there are no warnings for such things. But it should be common knowledge, shouldn’t it?
There are many of those little nuances that if treated properly could greatly improve any piece of code and prevent bugs from happening, if only we had some kind of universal interpreter which would audit and warn us about all those little things.
Actually, I am willing to bet that the situation is many orders of magnitude worse than what you might imagine.
Formal methods let you know where the bugs are. Should I expand?
Get ready to have them replaced by machine. No wait…
I am willing to bet that this has been routinely done for at least 10 years.