Given Enough Money, All Bugs Are Shallow

I find it far from clear who you’re talking to, @Nicolas3 … Me or Jeff.

If it was to me then, yeah, please do expand! No idea what you mean. :wink:

Here is where it goes wrong. This at least isn’t laetrile, but it is morphine.

Because I’m not willing to fully understand the code, I need to use X that allows me to write stuff I can’t (due to complexity) or won’t (because of time or budget constraints) fully understand.

OK, use all these languages and tools that confine you to a straitjacket and keep you in a padded cell. You can still be a totally insane raving maniac. You can still write bad, problematic code. You will write such code to the extent that you don’t fully understand what it is doing.

Sometimes those coding condoms don’t work. They prevent you from explicitly causing a buffer overrun, but somewhere above or below you in the stack one will exist, and it will happily pass the logic bomb to a place where it can explode. Then there’s the performance penalty. Or resource exhaustion.

People often don’t understand why I work hard to remove anything dynamic - for example, malloc. Note that at least an earlier version of zlib’s decompression code malloced LIFO. So I could just point to a blob and treat it as a stack (malloc became moving a memory pointer down by the size requested and returning the pre-request value; free became setting the pointer to the passed-in value).
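
Something like this, as a rough sketch of the idea (the arena size, names, and alignment are mine, not from any real zlib port):

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative only: a blob treated as a stack, relying on the LIFO
 * free order described above. */
#define ARENA_SIZE 4096
static uint8_t  arena[ARENA_SIZE];
static uint8_t *next = arena;                  /* next free byte in the blob */

static void *stack_malloc(size_t n)
{
    n = (n + 7u) & ~(size_t)7u;                /* keep 8-byte alignment */
    if ((size_t)(arena + ARENA_SIZE - next) < n)
        return NULL;                           /* blob exhausted */
    void *block = next;                        /* the pre-request value ... */
    next += n;                                 /* ... then move the pointer */
    return block;
}

static void stack_free(void *p)
{
    next = p;                                  /* LIFO: roll back to the block */
}
```

No bookkeeping, no fragmentation; the only failure mode is running out of the blob.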

There is a fundamental difference in kind between removing the last malloc and using a language that buries some kind of (hopefully) safe malloc (assuming you can get the hardware to support it). Since I do a lot of embedded, usually there isn’t any fancy hardware. Sometimes there is no malloc at all. And code bloat requires a much more expensive chip - consider that if you sell 10 million of something, a one-dollar increase to get more speed, RAM, or flash means $10,000,000 off the bottom line. Oh, and sometimes they can’t even be updated, or at least not easily (see what it would take to access the ECUs in your car).

Many would be horrified if I showed certain bits of code. The shortcuts, the type punning, the pointer aliasing. And yet it would take only a few seconds to see what I’m doing and why it works. So my 100 lines of C (which might fit on an Arduino) would do the same thing - and faster - than someone practicing “safe coding” with 1000 lines on a Raspberry Pi 2. I find it easier to explain every one of my 100 lines and 3 local variables than the 1000 and 30.

http://twit.tv/show/coding-101/51 goes into it - in the program, Steve Gibson is at an electronics store and strikes up a conversation with someone who needs parts for his business. He complains that without their fancy CAD IDE they can’t do a trivial circuit; they don’t understand how the logic (much less the transistors and other components themselves) works.

How many “keep me from harming myself” programmers even understand when to use a simple sort like insertion or bubble sort versus quicksort? Or do they just call the system sort (quicksort) on a 5-entry variable menu - every time a single entry changes?
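
Something like this covers the 5-entry case (a sketch; the menu_item struct is invented for the example):

```c
#include <stddef.h>

/* For a handful of entries - especially ones that are already nearly
 * sorted after a single change - insertion sort is a few dozen
 * instructions, no recursion, no function-pointer comparator. */
struct menu_item { int key; const char *label; };

static void insertion_sort(struct menu_item *a, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        struct menu_item x = a[i];
        size_t j = i;
        while (j > 0 && a[j - 1].key > x.key) {   /* shift larger entries up */
            a[j] = a[j - 1];
            j--;
        }
        a[j] = x;
    }
}
```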

Remember, every cycle takes a bit of energy and drains the battery a bit. So another effect is that my 100 lines will run for a month where the 1000 lines will run for less than a week. So what if we have bitcoin-miner computers that draw more amps than a refrigerator so that the bad, brown code will respond quickly? Isn’t it better to have green code that runs on a solar-powered tablet?

Security might be something gray and complex (but complexity is the enemy of security, hence my 100 lines are more secure, even if they are doing crypto on a 5 GHz processor). But you can’t fool physics. Every state switch takes a few picojoules. A thousand switches will take ten times what a hundred will. Refreshing 2 GB of RAM will take twice what 1 GB does.

So simplicity, minimalism, security, reliability, and even power consumption are all helped with the same approach.

Even if every buffer overflow in every operating system were fixed tomorrow, we would still see data breaches.

Yeah, but not as many.

There has been a lot of research in academia about automated test generation, model checking, formal verification, etc.

Some of these methods have been very promising:

  • SpecExplorer saved 50 person-years of testing effort
  • 1/3 of all Win7 WEX security bugs were found by SAGE; SAGE is now used daily in Windows, Office, etc.

Why do you think these efforts haven’t caught attention in broader software development communities?

Since I do a lot of embedded, usually there isn’t any fancy hardware.

I’m not saying there should be a law passed that everyone must use Python.

Just saying look at the details of a lot of high-profile, very-wide-impact security flaws, and it turns out that they were doing plain old C with library-provided dynamic allocation, i.e. exactly what they’d avoid if they were actually using C for a good reason: to avoid wasting cycles and achieve static memory requirements.

But they used tons of fine-grained malloc/free calls (typically slower than allocating from a high-quality compacting GC), i.e. they were writing C code that mimics the behaviour of a dynamic language while making no effort at all to squeeze out performance in the way you are concerned about. And then 20 years went by and nobody cared, so they were right about one thing: there was apparently no need to take care over micro-optimisations.
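
To sketch that contrast (names and the 512-byte limit are invented, not taken from OpenSSL or any other real project):

```c
#include <stdlib.h>
#include <string.h>

/* Dynamic-language style in C: a fresh heap allocation per message,
 * a failure path on every call, and a free() the caller must remember. */
char *copy_message_dynamic(const char *msg)
{
    size_t len = strlen(msg) + 1;
    char *buf = malloc(len);
    if (buf != NULL)
        memcpy(buf, msg, len);
    return buf;
}

/* Static-memory style - the usual reason to pick C at all: one fixed
 * buffer, a bounds check, and no allocator in sight. */
static char msg_buf[512];

const char *copy_message_static(const char *msg)
{
    size_t len = strlen(msg);
    if (len >= sizeof msg_buf)
        return NULL;              /* too big: refuse rather than overrun */
    memcpy(msg_buf, msg, len + 1);
    return msg_buf;
}
```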

If by looking at the code you can see that they’ve ended up following Greenspun’s tenth rule, then clearly they weren’t taking any care to be super-efficient. So if that wasn’t an important concern, why did they think it was wise to wobble across the tightrope over the valley of undefined behaviour, when they could have just used the sturdy bridge and got the same performance with no buffer overruns?

Wow, this is just … bad. But how do we change people’s attitudes?

I believe this to be a difficult question, because it might easily be the flip side of the same coin that makes people care about something they don’t get paid anything for: a sense of ownership over code you’ve written. In moderate doses, that sense of ownership makes you put extra effort in and take pride in having clean, working code on your project, good documentation, and so on.

Taken too far, that same sense of ownership will make you interpret every issue raised with your code as a malicious and personal attack.

Maybe making the knowledge of what security is about, what the dangers are and so on more accessible would help.

As a long-time computer security professional specialising in the exploitation of code bugs, I can say that Linus’s Law is crap. I see why he said it, but it doesn’t reflect reality. For what it’s worth, money doesn’t help either - the existence of open source basically proves that.

Here are some home truths:

  • Many eyes don’t mean jack - there are skilled eyes and unskilled eyes. Never underestimate the ability of an excellent developer to not notice an integer overflow - even when it’s pointed out to them! (See the sketch after this list.)
  • Finding bugs is easier if you have the source - otherwise you end up reversing the binary to get something like the source (I know, I’ve done it a lot) - all Open Source does is ‘lower the barrier to entry’
  • Open source ‘allows’ people to look, but the only people with the right skills that you can guarantee will look are attackers
  • All software is massively buggy and bugs aren’t ‘exploitable’ OR ‘not exploitable’ - most bugs are exploitable given enough time and skill. We [the industry] used to think stack cookies would make exploitation impossible.
  • If you can’t write one secure/good implementation, what makes you think writing another will help? See The Mythical Man-Month for a good discussion of 2nd and 3rd versions.
  • Good software engineering practice helps a lot. OpenSSL misses brackets and generally does tonnes of things ‘clever programmers’ do; just look at the horrid fix for Heartbleed. Except it’s not clever, it’s stupid. Good developers write code that people can read, quickly and concisely.
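
As a sketch of the integer overflow point above (the struct and function names are invented):

```c
#include <stdint.h>
#include <stdlib.h>

struct record { uint32_t id; char name[60]; };        /* 64 bytes each */

/* Buggy: `count` comes from untrusted input and the size is computed in
 * 32 bits, so e.g. count = 0x04000000 makes `total` wrap to 0.  malloc
 * succeeds with a tiny (or zero) buffer, and the loop that later writes
 * `count` records runs far off the end of it. */
struct record *alloc_records_bad(uint32_t count)
{
    uint32_t total = count * (uint32_t)sizeof(struct record);   /* wraps */
    return malloc(total);
}

/* Checked: reject any count whose total size cannot be represented. */
struct record *alloc_records_checked(uint32_t count)
{
    if (count > SIZE_MAX / sizeof(struct record))
        return NULL;                                   /* would overflow */
    return malloc((size_t)count * sizeof(struct record));
}
```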

Money makes a difference in motivating skilled security engineers, and technically bug bounties should help. But in reality they don’t. You get crud reported (“port 21 is open on ftp.mozilla.com”), and assessing it takes up the time of good security engineers - distracting them from real security work. If these people were any good they would already be being paid for it - the truth is that most ‘real’ security issues reported come from people in security roles (at other firms) who find them whilst doing other security work.

If you want a law which works, here goes: the Coops law of software security - “it’s not money or eyes which make bugs shallow; it’s engineering effort - you need to pay for security with missing features”. Or to put it more bluntly: until your project has a filled-out, and followed, threat model, passes all freely available static analysis/compiler checks without error, has at least one security engineer (with expertise) who reviews high-risk areas of code, and does basic amounts of fuzzing of the attack surface … you have nothing; open or closed source.
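
As one concrete (and assumed) reading of the “freely available checks” and “basic fuzzing” bar, a minimal libFuzzer harness; parse_packet() and the file names are stand-ins for whatever actually faces untrusted input:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical harness.  Build with clang:
 *     clang -g -O1 -fsanitize=fuzzer,address -Wall -Wextra -Werror \
 *           fuzz_parse.c parse.c -o fuzz_parse
 * Running ./fuzz_parse then hammers the attack surface, and the
 * sanitizers turn memory errors into immediate crashes. */

int parse_packet(const uint8_t *data, size_t len);   /* the attack surface */

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    parse_packet(data, size);
    return 0;
}
```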

Coops

That’s a very good point. I think an important part of it would be to change the perception of pride, as it relates to software development. People shouldn’t be proud of writing code, but rather proud of maintaining code.

How to change that perception in that way on a large scale… that needs more thinking about.

I don’t quite agree with the bolded text. I don’t get paid (yet) for any of the research I’ve done. :frowning:

If there is a fundamental architectural issue with the first implementation, then yes, building a rewritten implementation can be a perfectly valid thing to do. It means you do not have to deal with backwards compatibility.

I’d like to move in a slightly different direction, if I may, with a different take on peer review. To me, it’s more about the reviewee, not so much the reviewer(s). Which is to say, review should involve both roles: the coder should actively explain their code to the reviewer. As a coder, I really have to be part of the review.

The article and several responses note that the reviewing eyeballs aren’t necessarily experts in the problem domain, and certainly not in the particulars of the code under review. That expert is, of course, the person who wrote the code. So it’s useful to turn the coder into an effective reviewer of their own work.

So I’ve learned over the years that the critical point of a code review is that I have to be able to provide a coherent explanation of what I’ve added/changed/removed to the reviewer, such that I could convince myself that it makes sense. And often enough, I’ve found that the bit of code I’m explaining just doesn’t do what I meant it to ("… and here I’m checking that the input string isn’t empty…"), and as I say those words, I realize that bit of code is buggy.
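
A made-up example of that kind of slip, just to illustrate:

```c
#include <stddef.h>

/* "… and here I'm checking that the input string isn't empty …"
 * Except it doesn't: it only checks for NULL, so "" sails straight
 * through.  Saying the explanation out loud is what exposes it. */
static int process_name(const char *name)
{
    if (name == NULL)            /* meant: name == NULL || name[0] == '\0' */
        return -1;
    /* ... proceed assuming a non-empty name ... */
    return 0;
}
```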

Not to say that there’s no role for a completely independent reviewer, especially when lives depend on correct code, but the easiest & fastest way to find most bugs is to have the coder explain assumptions & code, and have both coder and reviewer(s) check assumptions & code interactively.

Not so easy in an open-source context, I agree. Can only suggest that a contributor should actively recruit reviewers, or that someone build a mechanism to facilitate that (e.g. let people register as reviewers so that a new commit results in a request for review).

Firstly, on OpenSSL: I still believe that LibreSSL is the better solution; the OpenBSD guys have yet to do us wrong, while OpenSSL is now repeatedly showing up in security bugs. I’ve personally been wondering if LibreSSL is vulnerable to the recent SSLv3 exploits. I don’t think it is, because I think they removed SSLv3 support entirely.

We all have the same goal, more secure software

No, the NSA has the goal of having more exploited software. Most companies have the goal of making more money; they don’t take security problems seriously.

You know what I think the real problem is, though? Education. Most programmers aren’t taught how to mitigate security exploits. In job interviews you wouldn’t believe the number of programmers that can’t answer this question:

What is SQL injection? How do you prevent it?

More than 50% of the people we interview can’t answer both parts of this correctly (for the second part I’m just looking for parameterized queries or a framework that provides them; heck, I’m not even sure 50% get the first part right). When I ask about other security-related stuff, they’ve never even heard of it.
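
For reference, the parameterized-query answer I’m looking for, sketched with SQLite’s C API (the choice of SQLite and the users table are assumptions for the example):

```c
#include <sqlite3.h>

/* The untrusted name is bound as data, never spliced into the SQL
 * string, so input like  x' OR '1'='1  cannot change the query. */
int find_user_id(sqlite3 *db, const char *untrusted_name, int *out_id)
{
    sqlite3_stmt *stmt;
    int rc = sqlite3_prepare_v2(db,
        "SELECT id FROM users WHERE name = ?;", -1, &stmt, NULL);
    if (rc != SQLITE_OK)
        return rc;

    sqlite3_bind_text(stmt, 1, untrusted_name, -1, SQLITE_TRANSIENT);

    rc = sqlite3_step(stmt);
    if (rc == SQLITE_ROW)
        *out_id = sqlite3_column_int(stmt, 0);

    sqlite3_finalize(stmt);
    return (rc == SQLITE_ROW || rc == SQLITE_DONE) ? SQLITE_OK : rc;
}
```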

My point is, our web development books don’t teach how to avoid SQL injection (let alone harder attacks like CSRF or JavaScript injection), and our C books don’t teach bounds checking. How can we expect to improve security when we aren’t teaching people how to be secure?

As I pointed out originally, making open source robust and secure is a “public good”. It happens with Linux because you have many sponsors with their own paid personnel submitting patches to the kernel, but the kernel is active. It has to get new drivers, new CPUs, new architectures.

I think it might only cost a few million dollars to “fix” the more static but important infrastructure projects like OpenSSL. Then issue grants or fellowships for uberhackers to sit home and clean up the code. But I doubt Redhat, Ubuntu, the Linux Foundation (since it is not part of the kernel), and the rest of the industry will do something like this. Note that GPG had/has the same problem - most package managers use gpg authentication.

I think I indirectly said that the eyeballs need to be skilled in that not everyone can refactor to the quality plateau, but that means more money to hire enough of those who can.

Excellent discussion so far. I agree with @prasun that better automated tools would be nice, but I think the community doesn’t pay much attention because the task is so immense, the fraction of security flaws automation can discover is small, and nobody has any illusions about the fact that automation will never discover many classes of exploits.

Anyway, this article about a modern OS X exploit highlights the amazing power of reading the code as it leads to better and deeper exploit development…

Even if the code here is assembly.

I don’t think the security of open source software (especially, infrastructural) is a matter of money. Rather, I believe that it is mostly the matter of establishing the right governance mechanisms for critical open source software projects. I briefly touched on this issue in this answer of mine on Quora: http://qr.ae/diooK.

Money was always associated with exploits; the issue was that exploiters were either disclosing publicly, for recognition in the security community only, or being enticed by the money in the black market for 0-days.

A lot of security people sold exploits before bug bounties; the difference is that nobody bothered to try to squeeze any money out of corporations. It was a joke.

You’re right though, those with bug bounty programs are definitely scrutinized more heavily, but that’s basic economics. It isn’t as if there is no incentive to find bugs in services which are critical without specific bounty programs. Back in the day, I would have loved to have found one for the recognition alone. Plus, there are non-specific bug bounty programs these days like the hackerone “The Internet” bug bounty program, which pays out for critical vulnerabilities to a wide variety of critical resources.

Anyway, the point is, money for exploits will exist whether we want them to or not. The difference is they’ll be sold to the black market only rather than disclosed to companies. I’ve made some bounty money and it has made me research things I never would have before, and has generally made me better at researching other things in my spare time.

You have a point that they might be drawing attention away from other services, but those services aren’t flat out ignored. Disclosures look good on resumes. Anyone wanting to make a name will do it.

And sure, this all sounds very opportunistic - I have not once mentioned “what about just helping them out for the good of society!” - but realistically, nobody really gives a crap about that. Nobody in the entire world; not enough to forsake their real-life responsibilities, anyway. What about “helping them because you like the project”? Well, that’s another matter. A hobbyist security researcher may do just that; however, it’s far less enticing than money; it’s wanting to contribute out of the goodness of their heart.

Honestly though, a lot of security people don’t really care about open source projects all that much. It’s the thrill of the chase, and breaking things that is enticing. Not helping, unlike open source.

Sure, I know a lot of people who want to contribute to open source security tools which aid in breaking stuff, but that’s very different from considering a disclosure a contribution to an open source project in the same way code contribution is. A lot of us simply do not feel that way, and honestly I think it makes us better at what we do.

We don’t have a particular drive to share beyond what gets us recognition for being a good security guy. We like to be recognized for the things we do, but by other researchers, not by developers.

It is a far different culture, in my opinion, and I really don’t think what you’re suggesting is going to work.

EDIT: Final word, if somebody tries to “ransom” bugs, post their name on a shame list or report them to the police. They fully deserve it.

There’s a balance between how much formal analysis a language supports and how cumbersome it is to use. The more you specify about your code, the more a computer can verify about it. We’ve seen big changes in the last decade, as type inference and null safety have taken off in modern languages.

  • Type inference: In trendy languages like Swift, Scala, and TypeScript, rather than telling the compiler what type a variable is, the compiler usually figures it out from how it’s initialized. When that variable is returned from a function, the type of the function is inferred. And the inferences continue as other functions return the result of that function. The result is strict type safety with the ease of a late-binding scripting language.
  • Null safety: Many newer languages discourage or disallow nulls. Scala has wrapper classes (None and Some) to replace nulls. Groovy, C#, Swift and others have operators for short-circuiting nulls or supplying default values.

If you want to get a sense of where we’re headed, try using Kotlin in the IntelliJ IDEA IDE. IntelliJ’s secret sauce for the last decade has been deep static analysis built right into the IDE, so it can flag bugs (e.g. unreachable code) as you type. Kotlin is the language they’re developing to replace Java (in their own code), and it’s built around static analysis. For example, if they can prove that a variable is a certain type, it is automatically cast as that type.

Preparing for Kotlin is also changing how the IDE treats Java. Kotlin requires variables that can be null to be labeled. With every new IDE version, the static analysis assertions for Java get stricter. At first the IDE was just encouraging me to annotate when a method can or can’t return nulls. Now it pretends to mark up the code with assertions like “if given null, return null, else never return null.”
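
A rough C analogue of those nullability annotations exists in the GCC/Clang attribute extensions (a sketch; the function names are invented, and this is the compilers’ mechanism, not anything from IntelliJ):

```c
#include <stddef.h>

/* GCC/Clang nullability attributes feed the same kind of static
 * analysis: the compiler warns when a literal NULL reaches a nonnull
 * parameter, and analyzers may assume the annotated return value is
 * never NULL.  Function names are illustrative. */

__attribute__((nonnull(1)))
size_t record_length(const char *record);       /* arg 1 must not be NULL */

__attribute__((returns_nonnull))
const char *default_config_path(void);          /* result is never NULL */
```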

Bringing this back to security, languages are getting smarter about providing safety without being too cumbersome. But none of these automated tools will ever fix security. For one thing, computers can’t tell a security hole from an intentional feature, and they can’t tell attackers from regular users. Especially since the attackers are trying to look like regular users. For another thing, attackers will always exploit the weakest link, which may be a buffer overflow, a confusing UI, or the user’s willingness to believe a lie.

First time posting, so I just want to say that this is an excellent blog! I don’t agree with everything on here (I’m not as big a fan of tablets and the like as Atwood) but I can see where you’re coming from on just about everything.

As for this post, I was thinking: you don’t necessarily need to use money as the (only) incentive. I imagine for the vast majority of us, every program we use has certain weaknesses that really affect our work flow. For simplicity, I’ll just talk about features, though the usual “don’t necessarily give the user what they say they want. Instead solve their problem” caveat applies.

So one option to provide incentives would be: If you find a bug, then we’ll move one or more features that you’d like to have higher up on our priority list (or more generally: We’ll make it a higher priority to improve some aspect of the program that you feel is lacking). The bigger the bug, the higher the bump in priority.

Now obviously this would have to be within reason. If used blindly you could easily run into feature creep. Plus some features may not actually be good ideas, or may not mesh well with your program, or it may just not be viable to implement.

While this does have some of the same issues as money (I don’t want to tell you about my bug, because then your feature will go up, not mine!), I don’t think the issues are as severe. For one thing, improving the program helps everyone (unless the feature is really really niche), whereas paying one person only helps that one person. For another, I imagine that it can be easier for people to find common ground on aspects of a program that need improving. For a third, it’s nowhere near as high stakes.

Not a huge fan of this particular post, sad to say. It conflates two things, the first a genuine insight, but the second a misunderstanding.

The bit about not all OSS users having the expertise to contribute is spot on. Speaking as someone who has run a couple of smallish Free Software projects, you will find that only a very small percentage of your users are capable of becoming contributors. It varies wildly by project too. For an API (where users are presumably all programmers), it might be on the order of one in a hundred, while for an application it is more likely to be one in thousands. But of course an application is liable to have far more users, so it balances out (if you ignore requests for free personal support).

However, saying that the presence of a single (even a single major) bug in an Open Source project disproves Linus’s Law is a fundamental misreading of the point of the statement. The idea is that it is far easier to find bugs, and get them fixed, in software you depend on if there’s some access to the sources outside of the development house. It was formulated as an absolute (by ESR) probably either to make it more pithy, or because ESR is a guy who likes to think in absolutes.

Put it this way: if we somehow had a magical historical wand that could retroactively make OpenSSL’s market position occupied by a proprietary solution, would this identical bug not have been possible? Would the proprietary solution have had fewer such bugs, or in fact more? There’s no way to tell with this particular hypothetical of course, but from what I’ve seen dealing with other software of both types in my career, my money would be firmly on more (and what’s worse, you’d be at someone else’s mercy to get the damn bug fixed).

This isn’t about producing perfect software. It’s about producing better software.

I’d be very careful making such “nobody” statements.