Unit Testing vs. Beta Testing

Why does Wil Shipley, the author of Delicious Library, hate unit testing so much?


This is a companion discussion topic for the original blog entry at: http://www.codinghorror.com/blog/2005/10/unit-testing-vs-beta-testing.html

I also tend to agree. About 15 years ago I worked at a shop that had the formal “QA” department along with the traditional Engineering/Programmers. The management (very poor) had allowed the developers to literally lob code over to QA to “test”…without ever testing it themselves. About half the time, it was kicked back because it wouldn’t build (long before the modern IDE like VSN or Eclipse). Our releases took longer and longer to see the light of day because the testing group was doing all this nonsense. Customers were griping about how bad our software was. Heads roll and in comes a new VP of Dev. First thing he does? Cans the whole QA group. Why? His opinion was that all QA does is delay the release of buggy code. Now the developer was responsible for finding obvious bugs (like not building) plus “unit testing”. Then we released it to a select few customers that were beta sites and had an incentive to test the stuff better than anyone could imagine. Time to market improved, quality improved, satisfaction improved, developer quality improved, just by getting rid of the “testers”. Wonder what ever happened to that place? Gonna have to Google it I guess…

Eric Sink has a nice article this week about his experience with stress testing. He shows that “some” automated testing can be good. I’d agree. It’s just the binary attitude of the software world that gets to me. It’s not about do I test or not. It’s about how much testing should be automated and how much should be manual (programmer/QA/beta). But then, we all know that don’t we? :wink:

I agree somewhat with what you’re saying. But if you have a “unit test writer” and a “developer” such as your example scenario with the guy from Sun, then the person who set up that arrangement entirely missed one of the major benefits / motivations of unit testing:

The act of writing a unit test is more an act of design than of verification.

For a more detailed comment on this, read Scott’s blog post: http://geekswithblogs.net/sbellware/archive/2004/08/01/9174.aspx

Regards,
Michael

One of the reasons for Test Driven Development is to design your code as it will be used. In my experience it has lent itself to some very elegant and maintainable code.
The greatest benefit that I see with unit tests is in maintenance. When I go and change some arbitrary method somewhere, and because I have unit tests running after source control check in, I will know right away if that code change just broke something unforeseen. That’s where I’ve seen unit tests save man hours in QA or new production bugs.

What about regressions?
What about thinking about how your code will be used before you write it?

I switched from a hash table to a dictionary some time ago. One of the big problems was that a dictionary will throw if the key is not found.
Unit testing allowed me to find all the places where I relied on a hashtable returning null if the key is not found.
I had zero bugs because of this change.
Can you really say that you can cover any decent amount of your code each time you make a change without using unit testing?
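
For illustration only, here is a minimal Python sketch of the kind of test that catches this. The comment above appears to be about .NET collections; Python is used here as an analogy, since `prices.get(key)` returns `None` on a missing key while `prices[key]` raises `KeyError`. The names `PRICES`, `price_or_default` and `price_or_default_strict` are hypothetical and not from the original post.

```python
import unittest

# Hypothetical data; stands in for whatever the real code looked up.
PRICES = {"apple": 3, "pear": 5}


def price_or_default(prices, item, default=0):
    """Caller written against the old behavior: a missing key yields None."""
    price = prices.get(item)  # .get() returns None when the key is absent
    return default if price is None else price


def price_or_default_strict(prices, item, default=0):
    """Same caller after a naive switch to a lookup that raises on a miss."""
    price = prices[item]  # raises KeyError when the key is absent
    return default if price is None else price


class MissingKeyTest(unittest.TestCase):
    def test_missing_key_falls_back_to_default(self):
        self.assertEqual(price_or_default(PRICES, "kiwi"), 0)

    def test_naive_port_raises_on_missing_key(self):
        # The failure an existing test suite would surface after the switch.
        with self.assertRaises(KeyError):
            price_or_default_strict(PRICES, "kiwi")


if __name__ == "__main__":
    unittest.main()
```

A test like the first one, written before the change, is what points at every caller that silently depended on the null-returning lookup.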

Interesting points but anyone trying to compare beta testing to unit testing or saying that it is an either/or choice really doesn’t get unit testing as it’s usually referred to today. Also someone that worked strictly as a “unit test writer” hasn’t actually done unit testing in the way that gives its greatest benefit. Perhaps it would help to compare Automated Testing to Beta Testing but Automated Testing is the apple to Unit Testing’s orange.

You need both unit-testing and beta-testing, plain and simple. There are many great resources on what modern agile test-driven development is but just writing a unit test after the fact isn’t it.

As a side note experience matters very little if it’s not in the correct field. If a programmer has spent his 30+ years writing the best structured (vs. OO) Fortran or C known to man then I bow before his wisdom (seriously) but his thoughts on OOD or TDD or Agile or whatever I give as much weight as the 16 year-old still in high school (which is actually higher than you might think). 21 or 15 or 2 years of “programming experience” means nothing until given specific and exact context.

It’s about how much testing should be automated and how much should be manual (programmer/QA/beta)

That’s what I think too. A certain amount of automated testing is clearly a good thing (eg Ayende’s example), but it’s always best to err on the side of real world, real user beta testing whenever possible.

If you have plenty of time to code in unit tests and perform an extensive beta test period, then by all means do both!

Otherwise favor the beta testing heavily.

As a side note experience matters very little if it’s not in the correct field.

This guy ships apps that people use.

I weight that real world experience far more heavily than some programmer buried in the IT department of FacelessCorp, writing software that few people use and even fewer even WANT to use.

And I say this because I’ve been a FacelessCorp employee!

He shows that “some” automated testing can be good. I’d agree:

http://software.ericsink.com/articles/Crowd_Test.html

This is a really interesting point. He’s talking about Load Testing exclusively. And NONE of the bugs he found (some rather severe) would have been uncovered by code-level Unit Tests!

To be fair, it’s unlikely a beta test site would have uncovered these issues, either. Not enough load.

I guess the lesson here is that you can’t expect to attack testing through any one method. A combination of multiple methods is necessary to ensure proper coverage!

A) In my experience, writing unit tests makes me go faster as a developer, not slower. Why? Because I’m not spending time debugging.

B) No-one ever said that unit testing was all you need to do. At the very least, you still need end-to-end tests, and you need manual exploratory testing for the qualitative judgement only a human can give you.

C) With manual tests, you still want to automate tests for regressions; manual regression testing becomes a massive time sink over the life of a long product with multiple releases. If you’ve already invested in a testing framework, testing regressions is cheap.

Can you build software without unit tests? Oh, you bet. I’d just argue that it is less efficient. I’d also argue that I’ve seen lots of test environments that don’t “take like 100 man-hours of setup time”, don’t “suck down a ton of engineering resources”, and prevent “any particularly relevant bugs.”

And I say this because I am a FacelessCorp employee!

I think it’s obvious that testing must occur, whether it’s good and righteous TDD and automated Unit Testing, XP Paired Programming or just the developer banging on the app like a monkey.

But it’s my experience that unless the “test department” is closely managed and lives inside a mature process, they typically provide limited value.

Not talking stress testing, configuration/deployment management or full environment testing. These are things (in my experience here at FaceLessCorp) that are not done well by the developer/programmer.

But unfortunately what a lot of QA groups return are “bugs” that are really them interpreting the original unclear requirements differently than the developer, them imposing their own set of arbitrary standards, or them pointing out either merely obvious issues or completely bizarre corner-case issues.

None of which are completely valuable to me the developer.

There’s nothing wrong with unit tests per se. There’s something seriously wrong with people relying solely on them, assuming that a program built from pieces that each pass a unit test will be bug-free. Or, better said, these people are badly misguided, forgetting that unit tests are just part of the vast world of testing - integration tests, functional tests, performance tests, stress tests, and so on.

I think that the best advice for unit test zealots would be just to go out and buy (or download :slight_smile:) any book on software testing, take the time to read it, and open their eyes to the whole world they’ve been ignoring.

You always say “double-clicking a button” here, “paste too much information” there.

So we’re talking only about UI-development? Good, because I successfully use unit-testing for our product that is mostly running on the server-side without any UI to be used.

If you have clear interfaces, a unit test is the perfect tool to check whether your implementation fits the description. If you write the test at the same time as the implementation of the interface, it doesn’t take much more time than the implementation alone (it grows with the implementation itself), and while the whole system (maybe including a UI) is still being built it can already tell you if something is broken after, e.g., an optimization of one part of the system.

Why should I start testing the functionality of a UI if I know that the underlying functionality is broken?

Unit testing is the way I can ensure that a specific piece of functionality works without needing a UI to start it. So if a UI implemented later on top of this functionality reports an error, the bug is more likely to be in the UI than in the server code doing the work.
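
As a rough sketch of what testing against an interface with no UI can look like, here is a small Python example. Everything in it (`AccountStore`, `InMemoryAccountStore`, the deposit rules) is a hypothetical stand-in, not the product described above.

```python
import unittest
from abc import ABC, abstractmethod


class AccountStore(ABC):
    """Hypothetical server-side interface; no UI is involved anywhere."""

    @abstractmethod
    def deposit(self, account, amount):
        """Add a positive amount to the account."""

    @abstractmethod
    def balance(self, account):
        """Return the current balance, 0 for an unknown account."""


class InMemoryAccountStore(AccountStore):
    """One possible implementation of the interface, kept in a dict."""

    def __init__(self):
        self._balances = {}

    def deposit(self, account, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        self._balances[account] = self._balances.get(account, 0) + amount

    def balance(self, account):
        return self._balances.get(account, 0)


class AccountStoreContractTest(unittest.TestCase):
    """Checks the implementation against the interface's description."""

    def setUp(self):
        self.store = InMemoryAccountStore()

    def test_deposits_accumulate(self):
        self.store.deposit("alice", 10)
        self.store.deposit("alice", 5)
        self.assertEqual(self.store.balance("alice"), 15)

    def test_unknown_account_has_zero_balance(self):
        self.assertEqual(self.store.balance("bob"), 0)

    def test_non_positive_deposit_is_rejected(self):
        with self.assertRaises(ValueError):
            self.store.deposit("alice", 0)


if __name__ == "__main__":
    unittest.main()
```

The tests exercise the implementation directly through the interface, so they can run long before any UI exists and can be rerun after each optimization.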

Best regards, Lothar

Anyone who thinks unit testing takes extra time hasn’t done unit testing properly. Anyone who has sat and only written unit tests has not done unit testing properly.

Similarly, anyone who thinks unit testing is sufficient testing for any application is incorrect. One of the main generators for unit tests should be bug reports, ergo other testing should be happening.
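
A minimal Python sketch of a bug report turning into a unit test might look like this. The function `parse_quantity` and the reported bug (blank input crashing) are invented for illustration.

```python
import unittest


def parse_quantity(text):
    """Hypothetical function: read a quantity typed into a form field."""
    text = text.strip()
    if not text:
        # Fix for the hypothetical bug report: blank input used to
        # reach int() and raise ValueError.
        return 0
    return int(text)


class ParseQuantityRegressionTest(unittest.TestCase):
    def test_blank_input_from_bug_report(self):
        # Reproduces the reported input; it stays in the suite so the
        # bug cannot come back unnoticed.
        self.assertEqual(parse_quantity(""), 0)
        self.assertEqual(parse_quantity("   "), 0)

    def test_normal_input_still_parses(self):
        self.assertEqual(parse_quantity(" 42 "), 42)


if __name__ == "__main__":
    unittest.main()
```

The report itself came from some other kind of testing; the unit test only pins the fix in place.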

It is important to note that unit tests are not the same thing as automated tests. There are many other sorts of tests that can be automated.

Wil Shipley is clearly confused on this last point.

I have found unit testing to be critical in saving time. I work at a shop that has a formal “QA guy”. Releasing a newly added feature to QA without unit testing costs time and money.

I am using VS 2005’s web tests to conduct automated regression testing. It may cost me a little more time up front, but it saves the organization a lot of time in the long run.

Ironically enough, I touch on this subject in an article I just finished:

Project Postmortem: The Right and the Wrong
http://tod1d.blogspot.com/2005/10/project-postmortem-right-and-wrong.html

This guy ships apps that people use.

I weight that real world experience far more heavily than some programmer buried in the IT department of FacelessCorp, writing software that few people use and even fewer even WANT to use.

(chuckle) You must have been a FacelessCorp programmer for a very small company then. Many applications written for niche or speciality fields are used by millions of people. Hospitals, the mortgage industry, medical imaging, etc. Trust me, it doesn’t really matter if someone wants to use your software, they have to and it better work or else someone is getting screwed out of large sums of money or losing their life.

Also I believe you are confusing my experience comment. I’m not disregarding his experience, simply saying that 21 years as a programmer is only so much chest beating. 21 years of shipping successful software works; 21 years of using a variety of testing methods in medium and large scale products would also give me a better opinion of what he does.

Not sure how this turned into a popularity contest though when we are talking about software testing methods. My bone of contention was that someone saying they have 21 years of experience doesn’t mean much to me, though having a successful application that is working does (many props to him for that).

I also believe there are different dynamics at work from a one person controlled, high UI-based web-application and large-scale multiple developer projects (from 5 to 20 people spread across time and space).

The author is comparing apples with oranges, and actually I found his post to be insulting and very short-sighted. Just because he works mostly on apps and UI doesn’t mean that’s all we other fellow programmers do.
Most of the low-level libraries he’s building his awesome beta-tested software on are probably very well tested using regression and/or unit tests.

I do use unit tests - but not for everything, and always in addition to functional/user testing.

I find unit tests are very good for data access layer code, mappers, core classes, software factories, web services and that sort of thing.

When you get higher up the stack the tests start to mean less. I tend to write more unit tests in lower layers, and then move more to functional/user testing at the layers where things are being combined together and/or presentation layer.

I have been thinking though that the usefulness of unit testing could be enhanced by the introduction of probabilistic unit testing. I have worked on a number of risk models in my capacity as a geotech engineer, and while I don’t want to get into a detailed description of what that was all about, it was fascinating work.

Basically it involved constructing a model from a set of equations for mine subsidence. In any geotechnical modeling there are a large number of uncertainties, so deterministic models often aren’t as realistic as they could be.

So: what you do is replace values in the model that are somewhat ‘fuzzy’ with probability distributions, then run the model in a Monte Carlo approach.

The reason you do this is that there are often sensitivities in equation systems that are not apparent: interdependencies based on values that cannot easily be seen. For example, a slight change in the output of one equation may have a drastic effect on the overall system’s result, where a change in another equation may not have much effect.

Using probabilities and Monte Carlo allows thousands of realizations of a system to be made - which helps expose where the sensitivities in the model are.

The same approach could be applied to unit tests - rather than using static tests, the inputs to the tests could be replaced with probability distributions. The inputs would therefore have a random nature to them. Depending on the type of testing desired, the distribution could be set up to produce values that land outside the allowed bounds of the class - thus randomly feeding invalid input into the class.

This would then in effect allow a more close model of what a ‘real’ person would do in testing - putting a range of values into classes, and inspecting the results.
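
One way this could look in practice, as a rough Python sketch: the idea is close to what is usually called randomized or property-based testing. The function under test (`clamp_percentage`), the normal distribution, and its parameters are all assumptions made up for the example.

```python
import random


def clamp_percentage(value):
    """Hypothetical unit under test: force any number into the range 0..100."""
    return max(0.0, min(100.0, value))


def run_monte_carlo(trials=10_000, seed=12345):
    """Feed randomly drawn inputs to the unit and collect any violations."""
    rng = random.Random(seed)  # fixed seed so a failing run can be replayed
    failures = []
    for _ in range(trials):
        # Mean 50, standard deviation 60: a good share of the draws
        # land outside the valid 0..100 range on purpose.
        value = rng.gauss(50.0, 60.0)
        result = clamp_percentage(value)
        in_range = 0.0 <= result <= 100.0
        # Inputs that were already valid must pass through unchanged.
        unchanged = result == value if 0.0 <= value <= 100.0 else True
        if not (in_range and unchanged):
            failures.append((value, result))
    return failures


if __name__ == "__main__":
    bad = run_monte_carlo()
    print(f"{len(bad)} failing inputs")
```

Instead of asserting on a handful of fixed values, the run checks properties that must hold for every draw, and the seed makes any failure reproducible.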

This would likely be even more valuable if tests were applied against classes farther up in the layers - where you might feed probabilistic inputs into several classes that are used together, and see what the results are.

Monte Carlo runs on this would then help flush out sensitivities in class collections that otherwise are difficult to expose, and normally only possible to find when the application is fired up and run by a user.

How to implement this so that it is not bizarre and complex? I am not sure yet :wink:

But that is only an issue of time and will…Would certainly make a great OOPSLA presentation topic!

The paper “Unit Testing in Practice” (M. Ellims et al., ISSRE '04) had some interesting data on the effects of unit testing in three industrial software projects. They found that not only did unit testing detect large numbers of bugs, but just the activity of designing the unit tests found bugs. In fact, more bugs were found this way than by actually running the tests.

In two of the projects, unit testing was tacked on at the end. They found this reduced its effectiveness by about 50%. In other words, it appears that, of the bugs that would be found by unit testing, half of them would escape the rest of their process.