Jeff, I will bend the context of your article a bit to pretend that the definition for Q3 was simply “3) Scripting and regexes. The candidate has to describe how to find the phone numbers in 50,000 HTML pages” with no further definition supplied (a better test, I think), as I want to make a point about the assumptions that lie behind the questions (and the possible answers)…
Candidate 1: That’s trivial! I can do that in 2 minutes! (Failed)
Candidate 2: I’ve never had the need to use regexes before, but, hey, that’s what books are for. But first, I have the following questions… (Passed)
Why? Because Candidate 1 will give the appearance of being brilliant and producing quick solutions, but his solutions will probably be rushed, half-assed and ill-conceived.
Candidate 2 asked the following questions during the interview, thereby demonstrating that she understands real world problems and client needs. She is a capable solutions developer, not merely a programmer with a collection of theoretical tools and no practical understanding as to how they might be used…
- How many of the telephone numbers are badly formed, broken over multiple lines, protected from harvesting by JavaScript, or embedded as graphics files?
- Are any of the pages Unicode and/or in other languages?
- Once we determine the proportion of telephone numbers that cannot be extracted by regexes, what is an acceptable loss? Are we willing to lose all of them, none, or somewhere in between?
Now, drawing liberally from one of the comments above, the difference between the guy who took days/weeks and the guy who took 2 minutes is that the guy who took longer recovered 99% of the telephone numbers and the other recovered 25% (only properly formed numbers located in the USA stored in plain text).
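To make the gap concrete, here is a minimal sketch of the kind of "2 minute" solution Candidate 1 might write. Everything in it is an assumption for the sake of illustration: the pattern, the function name `find_us_numbers` and the six sample snippets are made up, and the numbers are fictional. It only shows which kinds of input a naive US-style pattern quietly skips.

```python
import re

# Naive pattern (an assumption, not a recommended one): a 3-3-4 digit
# US-style number on a single line, with optional parentheses around
# the area code and '-', '.' or a space as the separator.
US_PHONE = re.compile(r"\(?\d{3}\)?[-. ]\d{3}[-. ]\d{4}")

# Hypothetical page fragments, one per problem Candidate 2 asked about.
SAMPLES = [
    "<p>Call (555) 010-1234 today</p>",         # well-formed, plain text
    "<p>Fax: 555.010.9999</p>",                 # well-formed, plain text
    "<p>Tel: +44 20 7946 0958</p>",             # non-US grouping
    "<p>555-\n010-1234</p>",                    # broken over two lines
    "<script>document.write('555' + '-010-' + '1234')</script>",  # script-built
    '<img src="phone.png" alt="our number">',   # number is an image
]

def find_us_numbers(pages):
    """Return every substring of every page that matches US_PHONE."""
    found = []
    for page in pages:
        found.extend(US_PHONE.findall(page))
    return found

if __name__ == "__main__":
    hits = find_us_numbers(SAMPLES)
    print(f"recovered {len(hits)} of {len(SAMPLES)} numbers: {hits}")
```

On these made-up samples the pattern finds only the two plain-text US numbers and reports nothing at all about the four it missed, which is exactly why the yield question has to be asked before the code is written.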
If this were a privacy issue, for example, “we are required by law to not allow telephone numbers to be published or people can sue us into oblivion (or we can end up in jail)”, the yield is important. And, as we are bending the context for the sake of argument, we also assume Amazon, being an international company, wants international solutions to international problems and does not want programmers who are thoughtlessly US-centric (a plague infecting many international websites).
Now, if neither guy asked any questions about the scope of the problem (and its goals) before starting, both are idiots. One may have produced an inadequate solution that comes back to bite the company 3 years later with a massive lawsuit; the other may have wasted time and money unnecessarily by over-scoping the problem.
Before anyone says that the scope of the problem is defined by the assumption that regexes are to be used: it is very common for employers to be determined to use Product/Method A to solve the problem (they only have a hammer and all the world is a nail). It is up to you, as a seasoned professional, to advise them that Product/Method A may not be up to the job and that Product/Method B should be used instead, and to explain why, clearly weighing up the pros and cons.
Sadly, you could eliminate a lot of candidates by asking: Do you pass parameters, or use global variables?; Draw me a data model for a database that holds questionnaires and their results; Do you know what a For-Do loop is and why you would use it?; What is a class?; What does ‘event driven’ mean and how does it affect the way you design your code?
I have met a disturbing number of in-house developers who would fail these questions.
“Watch out for one-trick ponies. Candidates who only know one particular language or programming environment, and protest complete ignorance of everything else, are a giant red warning flag.”
The counter to this is obvious: those who know many are experts in none. Sometimes you want experts, not dabblers.