Groundhog Day, or, the Problem with A/B Testing

On a recent airplane flight, I happened to catch the movie Groundhog Day. Again.


This is a companion discussion topic for the original blog entry at: http://www.codinghorror.com/blog/2010/07/groundhog-day-or-the-problem-with-ab-testing.html

This seems to be a case of not enough A/B testing. Give him a couple more decades, and I’m sure he could have figured out the correct way to invoke emotions in a woman.

I saw an early draft of the screenplay, which included scenes in a library, where Phil was able to move a book (which somehow didn’t get reset overnight) to act as a crude calendar, and the implication from these scenes was that he spent centuries re-living the day.

As for the overall A/B argument, it’s worth remembering that at the end of the film, Rita accepts Phil, because he actually has changed. A/B testing can achieve shallow local maxima, but it is not limited to that; you can also A/B test larger, more significant changes. The nature of the difference between A and B is up to the creator.

Love the idea, but I can’t stop thinking of what would happen if Groundhog Day followed the reality of A/B testing…

  • Phil would date people at random… “Well Ned loved the fudge.”

  • Success for Phil would be waking up with anyone.

  • Phil would have a product manager who wasn’t at any of his dates but who decided which feedback Phil was going to act on, and how.

  • Said product manager would believe that sleeping with people is a function of ice sculpting.

  • Said product manager would also be unaware of the repeating day which they are both stuck in.

Someone should point Eric Schmidt at this article. I can’t think of any better example than Google of a company employing endless statistical analysis to the most anal degree at the cost of any level of soul or beauty in their products.

41 shades of blue, anyone?

(P.S. No IDs on comments? Poor form.)

Well done, Jeff. This article has a lot of heart. Plus, the sandpaper quote is pure gold.

++Alan

Well, that was a well-written takedown of a strawman you’ve called “A/B testing”.

A/B testing is “empty. It has no feeling, no empathy, and at worst, it’s dishonest”? Really? Rubbish. Marketing drones can be emotionless, cynical and dishonest. Anyone can. That they use A/B testing is neither here nor there. These naughty marketing drones might use other devilish tricks such as… fancy graphic design! Engaging copywriting! They may even offer a discount!

Oh, the horrors colors, words, and savings have wrought on us all. Down with empiricism!

Of course, if you have no ideas, no talent, nothing to say, and nothing to offer, then sure, filling the void with A/B testing isn’t going to make much difference.

However, I believe there are lots of interesting, creative people who have more than one good idea. I believe these same people cannot read the minds of every single person who visits their web site, or uses their app. Therefore, I think it’s great that these people can test both their ideas, rather than having to make some evidence-free guess and rationalize it after the fact. An A/B test is only as good as your best idea, after all. Ideas still matter!

I think the really uncomfortable thing is A/B testing says you don’t know. I think some people find that hard to get their heads around. I’m smart! I’m about the love, maaan! I’m not a greedy marketing whore! I don’t need no stinkin’ A/B testin’!

Well, if your pride is worth more than the benefits you and your users get from A/B testing, so be it :slight_smile:

A/B testing is not used to find love, win over women, or make friends.
A/B testing is a tool created to score points, or in other words - make money.

Points, or money, are numbers.
Numbers = math. Math = science.

If your business is about creating friendships, I’m not here to judge, and I can’t help you measure success. But if your business is about scoring points, A/B testing is a necessary tool.

I always quote Groundhog Day as the perfect example of “Comedy that makes you think… a LOT”, and I usually watch it once or twice every year. It’s simply timeless.

Rita: "Would you like to have dinner with Larry and me?"
Phil: “No, thank you. I’ve seen Larry eat.”

An interesting assertion. I think it would benefit from an actual example of A/B testing, however, rather than just an interesting connection to a movie.
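
For what it’s worth, here is a minimal sketch of what evaluating an actual A/B test might look like, assuming the metric is a simple conversion rate and using a standard pooled two-proportion z-test. The function name and the visitor counts are hypothetical, made up purely for illustration; nothing here comes from the original post.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of variants A and B.

    conv_a / conv_b: number of conversions seen for each variant.
    n_a / n_b: number of visitors shown each variant.
    Returns (z, p_value) for a two-sided pooled z-test.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical numbers: 1000 visitors per variant.
    z, p = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

Note that the test only tells you which of the two variants you chose to compare did better. What A and B actually are is still entirely up to whoever designs the experiment.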

Hey Now Jeff,

A/B testing, great post!

Coding Horror Fan,
Catto

“Give him a couple more decades, and I’m sure he could have figured out the correct way to invoke emotions in a woman.”

Have you ever read ‘The Game’ by Neil Strauss?

@Michael Dorfman:

According to Dan Grobstein’s summary of the Groundhog Day commentary: “Danny Rubin [the scriptwriter] envisioned that Phil would relive the day for 10,000 years. In the original script he kept track of time by reading one page each day in the bed and breakfast’s library.”

Seems like a pretty clever solution to me.

@AaronSw Wouldn’t it be easier to remember one up-to-5 digit number, and one up-to-3 digit number, each day, than remember a book title and its page number, then have to count up all the pages of all the books already read just to know what ‘day’ it is? :wink:

Didn’t we do this yesterday?

A different slant on Seth’s Non-Optimized Life perspective:
http://sethgodin.typepad.com/seths_blog/2010/07/the-nonoptimized-life.html

I don’t entirely disagree with this post, but I will point out that sometimes a shallow local maximum is exactly what you want. Even if you’re going to throw out your existing work and do a revolutionary design, you should figure out which subtle variant on that design works best. Don’t rely on A/B testing to break new ground, but don’t assume that breaking new ground is always the point of making a change to a product.
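
To make the “local maximum” idea concrete, here is a toy sketch. It treats incremental A/B testing as greedy hill climbing: try a small change in each direction and keep it only if the score improves. The `landscape` function is an invented score with a small peak and a tall peak, and every name in it is hypothetical. It is an analogy, not a model of any real product.

```python
def landscape(x):
    """Hypothetical score: a small peak at x=2 and a taller peak at x=8."""
    return max(3 - abs(x - 2), 10 - 2 * abs(x - 8), 0)


def hill_climb(score, start, step=1, max_iters=100):
    """Greedy search: move to the better neighbour until neither improves."""
    x = start
    for _ in range(max_iters):
        best = max((x - step, x + step), key=score)
        if score(best) <= score(x):
            return x  # no small change helps, so we are on some peak
        x = best
    return x


if __name__ == "__main__":
    for start in (0, 5):
        peak = hill_climb(landscape, start)
        print(f"start={start} -> stops at x={peak}, score={landscape(peak)}")
```

Small steps from a starting point near the low peak stop at the low peak. Reaching the tall one takes a bigger jump to a different starting point, after which the same small-step tuning becomes useful again.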

That about sums it up for me…

Your Groundhog Day analogy makes me chuckle, but seriously, I think the ‘soulless’ problem with AB testing only comes when you apply the AB testing “tool” dumbly, with no particular goal in mind.

For example, applying AB testing to choose which background colour to use is something I would regard as “soulless”: it’s just testing opinion with no proper goal. However, what if I decide my goal is to increase the number of replies to questions posted on my Q and A forum site? If I use that goal to brainstorm ideas and then test them, my tests have a good, honest, and valuable purpose. I would not then call my AB testing ‘soulless’.