Well, that just raises the question - why the blog post in the first place? And why recommend a book that you haven’t read and/or didn’t understand? No cheap shots here, I actually want to understand why you took that credibility risk with the people who know this stuff - which, it turns out, is a lot of programmers.
In all honesty, despite reading all the above comments and the Wikipedia entries, I still don’t quite understand the definition of NP-complete, other than as “we’ll know it when lots of experts have seen it, tried, and failed.”
Have you really read all of the above comments? Not just scanned over them, but read them? You only know a problem is NP-complete once the experts have seen it, tried, and proved that it’s NP-complete. There’s a precise definition and a set of criteria for demonstrating that something is NP-complete.
I’ll have a crack at some explanations; I don’t know how much help they’ll be. I’m sure the book you linked to probably explains it better.
A Turing machine is a simple model of a computer. In its usual form its only output is to accept or reject an input, so instead of asking it what a + b is, you normally ask it if a + b = c - i.e. it solves decision problems.
A problem is in P (Polynomial time) if for all instances of the problem with size n (n = 100 if we have 100 objects to sort, for instance) the Turing machine takes a number of steps to come to a decision that is polynomial in n - e.g. there’s an upper bound of 2 * n^5 + 4 * n + 5 on the step count.
A lot of the time when we talk about the size we’re talking about the number of bits - adding two n-bit numbers is in P - O(n) - and so is multiplying two n-bit numbers - O(n^2) the simple way, although FFT-based methods do better.
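Here’s a quick Python sketch (my own illustration, not something from the post or the book) of why those bounds hold - ripple-carry addition touches each bit once, and grade-school multiplication does at most n shifted additions:

```python
# Numbers as little-endian bit lists: position i holds the 2^i digit.

def add_bits(a, b):
    """Ripple-carry addition: one pass over the bits, so O(n) steps."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    out, carry = [], 0
    for i in range(n):
        s = a[i] + b[i] + carry
        out.append(s & 1)
        carry = s >> 1
    out.append(carry)
    return out

def mul_bits(a, b):
    """Grade-school multiplication: up to n shifted additions, so O(n^2) steps."""
    out = [0]
    for i, bit in enumerate(b):
        if bit:
            out = add_bits(out, [0] * i + a)  # add (a << i)
    return out

def decides_product(a, b, c):
    """Phrased the Turing machine way, as a decision problem: is a * b = c?"""
    got = mul_bits(a, b)
    n = max(len(got), len(c))
    return got + [0] * (n - len(got)) == c + [0] * (n - len(c))

# 3 * 5 = 15:  3 = [1,1], 5 = [1,0,1], 15 = [1,1,1,1]
assert decides_product([1, 1], [1, 0, 1], [1, 1, 1, 1])
```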
This is where I back up for a second and talk about non-determinism. A regular expression is just a finite automaton, or state machine, under the hood. In the deterministic kind it has a single current state, and the next state is determined by the current state and the next piece of the input that is read.
Say we’re scanning a bit string rather than a character string, and we’re after strings that have a 1 as their tenth-last bit. Using a deterministic state machine we’re going to need about 2^10 states to handle this, because the machine has to remember the last ten bits it has seen.
This is where we add non-determinism. We allow the machine to go to several different states based on the current state and the next piece of the input. When we read a 1 we have two paths - one leads back to the state we’re in, and one leads into a chain of ten states which transition one to the next on any input and end with an accepting state.
Rather than having the machine encode 2^10 states, we let it split into different multiverses and take a big OR of the results. At each 1 it reads, the machine guesses that this might be the tenth-last bit, and we save some space describing the machine.
As a bonus, you can easily implement an NFA engine with only a polynomial increase in space/time for the processing over the deterministic equivalent, so they’re both in P. But I digress.
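To make that concrete, here’s a toy Python version of how an NFA engine handles the tenth-last-bit machine above - the state names are my own invention, but the trick is the real one: track the set of states you could be in, which costs only polynomial overhead per input bit, instead of building the ~2^10-state deterministic machine:

```python
K = 10  # accept bit strings whose 10th-last bit is a 1

# States: 0 = "still waiting"; 1..K = "guessed that the 1 just read was
# the K-th last bit, and this many bits have gone by since the guess"
# (state K accepts if the input ends there).
def step(states, bit):
    nxt = set()
    for s in states:
        if s == 0:
            nxt.add(0)          # one multiverse keeps waiting
            if bit == 1:
                nxt.add(1)      # another guesses: this 1 is the K-th last
        elif s < K:
            nxt.add(s + 1)      # count off the remaining bits
        # s == K is a dead end unless the input ends here
    return nxt

def accepts(bits):
    states = {0}                # the set of states we might be in
    for b in bits:
        states = step(states, b)
    return K in states          # some multiverse guessed right

assert accepts([1] + [0] * 9)         # the 1 is exactly 10th from the end
assert not accepts([0] * 10)
assert accepts([0, 0, 1] + [0] * 9)   # still 10th from the end
```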
So we’re processing the input of a Turing machine. Now we add non-determinism, so at every step the machine can split - giving us up to 2^n virtual-multiverse machines - and if any of them says yes to the current decision problem then the answer is yes. If the running time of each machine is bounded above by a polynomial in n, we have a problem in NP.
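If it helps, here’s the multiverse picture written out as brute force. I’m using SUBSET-SUM as the NP problem purely for illustration (nothing above mentions it) - each guess string is one multiverse, and the deterministic simulation just ORs over all 2^n of them, which is where the exponential time comes from:

```python
from itertools import product

def subset_sum_nondeterministic(numbers, target):
    """Deterministic simulation of the non-deterministic machine."""
    n = len(numbers)
    # One guess string per multiverse: guesses[i] == 1 means "include numbers[i]".
    for guesses in product([0, 1], repeat=n):   # 2^n branches
        if sum(x for g, x in zip(guesses, numbers) if g) == target:
            return True                         # some branch accepts
    return False                                # every branch rejects

assert subset_sum_nondeterministic([3, 34, 4, 12, 5, 2], 9)        # 4 + 5
assert not subset_sum_nondeterministic([3, 34, 4, 12, 5, 2], 30)
```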
That’s why there’s a polynomial-size check (or certificate) for all NP problems - it’s just an encoding of the n guesses that were made to get to the accepting state. You can take the original non-deterministic Turing machine, and at each place where you would normally guess, the certificate tells you which path to take and guides you to the accepting state. It’s like how factoring the product of two large primes is hard, but it’s easy to check a candidate factor if you already have one, via a single mod operation.
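And the flip side of the previous sketch - checking a certificate is cheap even when finding one isn’t:

```python
def check_subset_sum_certificate(numbers, target, guesses):
    """Linear-time check: replay the guess string instead of searching for it."""
    return sum(x for g, x in zip(guesses, numbers) if g) == target

assert check_subset_sum_certificate([3, 34, 4, 12, 5, 2], 9,
                                    [0, 0, 1, 0, 1, 0])   # the 4 and the 5

def check_factor_certificate(n, p):
    """Finding a factor may be hard; checking one is a single mod."""
    return 1 < p < n and n % p == 0

assert check_factor_certificate(3 * 2147483647, 2147483647)
```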
The NP-complete problems are, by definition, the hardest of the NP problems. It all starts with SAT3 (usually written 3-SAT) - just take it as a given for now that SAT3 is the first in the list of NP-complete problems; proving that is the Cook-Levin theorem.
Take a candidate for a second NP-complete problem. Using a polynomial number of calls (A) to a solver for this candidate problem and a polynomial-sized amount of Turing machine glue code (B), try to show that you can solve SAT3 (or something else on the list). If you can, the candidate problem is NP-hard, and if it’s also in NP you can add it to the list.
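To give a concrete feel for the glue code, here’s the textbook reduction from SAT3 to INDEPENDENT-SET sketched in Python - my example, not something from the post. brute_force_independent_set stands in for whatever solver you have for the candidate problem; the point is that the glue itself is polynomial even though this particular solver isn’t:

```python
from itertools import combinations

def brute_force_independent_set(vertices, edges, k):
    """The 'candidate problem' solver: is there an independent set of size k?"""
    return any(
        not any((u, v) in edges or (v, u) in edges
                for u, v in combinations(subset, 2))
        for subset in combinations(vertices, k)
    )

def sat3_via_independent_set(clauses):
    """Glue code. Clauses are 3-tuples of literals like 2 or -2 (x2 / NOT x2).
    Build one triangle per clause, join contradictory literals with edges,
    and ask for an independent set with one vertex per clause."""
    vertices, edges = [], set()
    for i, clause in enumerate(clauses):
        tri = [(i, j, lit) for j, lit in enumerate(clause)]
        vertices.extend(tri)
        edges.update((u, v) for u, v in combinations(tri, 2))  # clause triangle
    for u in vertices:                    # an edge between every x and NOT x
        for v in vertices:
            if u[2] == -v[2]:
                edges.add((u, v))
    return brute_force_independent_set(vertices, edges, len(clauses))

# (x1 or x2 or x3) and (not x1 or not x2 or x3) is satisfiable...
assert sat3_via_independent_set([(1, 2, 3), (-1, -2, 3)])
# ...but (x1) and (not x1), padded out to 3-literal clauses, is not.
assert not sat3_via_independent_set([(1, 1, 1), (-1, -1, -1)])
```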
So if solving NP problems takes exponential time, we have A * (an exponential in n) + B, and it takes a really long time to solve SAT3 and every problem between your problem and SAT3.
But if one of them can be done in polynomial time, we have A * (a polynomial in n) + B - and a polynomial number of calls to a polynomial-time subroutine is still polynomial - so it takes polynomial time to solve SAT3 and every problem between your problem and SAT3.
You may have spotted that so far this means solving SAT3 in polynomial time doesn’t do much for you. NPC is defined as the problems that are in NP and that every problem in NP reduces to - which means the NPC problems all reduce to each other - so for SAT3 to keep its crown you also build the tree in the other direction.
It becomes a crazy tangled web, but it’s clear that if you can solve one of the problems in polynomial time you can solve all of them in polynomial time, just by stepping through the web from the P-time problem towards the problem you’re after.