It might be possible to overdo it, but trust me, I have had twenty times as many problems with denormalized data as with normalized data.
With normalized data, I once had a complex join across tables with millions of records each and had no trouble retrieving results in mere seconds from the 450 MHz server they were still using. I just added the appropriate index (took about 45 seconds) and problem solved. Those 2 minutes of work turned into a year of consulting because I fixed it so fast.
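For what it's worth, the fix was nothing exotic. It was roughly this (the table and column names here are made up for illustration; the real schema belonged to the client):

    -- Hypothetical schema: orders joined to customers on customer_id.
    -- The join was slow because customer_id on the big table had no index.
    CREATE INDEX ix_orders_customer_id ON orders (customer_id);

    -- After that, a join like this came back in seconds:
    SELECT c.customer_name, o.order_date, o.order_total
    FROM customers c
    JOIN orders o ON o.customer_id = c.customer_id
    WHERE c.region_code = 'NW';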
But don’t get me started on denormalized data. Data gets stored willy-nilly throughout the application when it should be in a code table, and then, inevitably, someone wants to search on it. Except people misspelled it, or added "The" at the beginning, or ", The" at the end, or put two spaces in it a couple of times, etc., etc., etc. At that point, your ENTIRE dataset that the company has spent years gathering is WORTHLESS!
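To make the free-text problem concrete, here is roughly the kind of mess I mean (the values and names are invented, but I have seen every one of these variants in production):

    -- Free text: the same company stored five different ways.
    -- 'Acme Corp', 'The Acme Corp', 'Acme Corp, The', 'Acme  Corp', 'Acme Corpp'
    -- A search like this silently misses most of them:
    SELECT contact_id, contact_name
    FROM contacts
    WHERE company_name = 'Acme Corp';

    -- Normalized: one code table, one foreign key, one spelling.
    CREATE TABLE company (
        company_id   INT PRIMARY KEY,
        company_name VARCHAR(100) NOT NULL UNIQUE
    );
    -- With contacts.company_id referencing company(company_id),
    -- the search is exact, complete, and indexable:
    SELECT ct.contact_id, ct.contact_name
    FROM contacts ct
    JOIN company co ON co.company_id = ct.company_id
    WHERE co.company_name = 'Acme Corp';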
People don’t care what you put into a database. They care what you can get out of it. They want the freedom and flexibility to grow their business beyond 3 affiliations. (Oops, you wrote select * in your code; now you might have trouble adding columns.) And have you ever tried to write a query to find everyone in the same affiliation? Hope those are codes and not free text. And I hope the affiliation columns are indexed if you want any kind of performance on that search. But now you need 3 separate indexes instead of one, and the third one might perform poorly because it is mostly nulls plus whatever codes are typical of people who actually have 3 affiliations.
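If the affiliations lived in their own table instead of three columns, the "everyone in the same affiliation" query is trivial and needs one index, not three. A sketch, with hypothetical table and column names:

    -- One row per person per affiliation, so there is no limit of three.
    CREATE TABLE person_affiliation (
        person_id      INT NOT NULL,
        affiliation_id INT NOT NULL,
        PRIMARY KEY (person_id, affiliation_id)
    );

    -- One index covers the search, instead of three sparse ones.
    CREATE INDEX ix_person_affiliation_affiliation
        ON person_affiliation (affiliation_id);

    -- Everyone who shares an affiliation with person 42:
    SELECT DISTINCT pa2.person_id
    FROM person_affiliation pa1
    JOIN person_affiliation pa2
        ON pa2.affiliation_id = pa1.affiliation_id
    WHERE pa1.person_id = 42
      AND pa2.person_id <> 42;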
I could go on and on about the real-world problems I have faced with denormalized data, versus the lack of problems with normalized data.
Say what you want in your blog, but a lifetime of experience says that you are wrong.
And I’ve worked with Dare before and he’s a cool guy (so no offense, Dare, if you read this), but he writes tools for Microsoft programming languages and such. He doesn’t maintain software with 10,000,000-row database tables on a daily basis.