It has been said well by others already, but I feel strongly enough on this issue to add my two bits:
Denormalization is the last possible step to take, yet the first to be suggested by fools.
And, yes, I just called both Atwood and Obasanjo out as fools.
It is dangerous to make that suggestion so willy-nilly, as people take posts by prominent authors as something they can rely on. Even hinting that denormalization should be anything but a last step means that these guys simply cannot be relied upon as authorities in this space.
So, for those still feeling unsure after reading through all of the above:
Views are not about perf/scale; they are about query clarity and/or abstraction of the true data-source structure.
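To make that concrete, here is a minimal sketch (the customers schema is hypothetical):

```sql
-- A view that exists for clarity/abstraction, not speed: callers get a
-- simple "active_customers" relation without knowing how status is stored.
CREATE VIEW active_customers AS
SELECT c.customer_id,
       c.name,
       c.email
FROM   customers c
WHERE  c.status = 'active';

-- Callers query the view as if it were a table; the optimizer expands it
-- inline, so there is no inherent performance gain (or loss).
SELECT name, email FROM active_customers WHERE name LIKE 'A%';
```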
Materialized/indexed views can help perf/scale, but they impose restrictions on the underlying data structures and on what the view itself may contain.
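As an example of those restrictions, here is a hedged sketch in SQL Server's indexed-view dialect (the schema is hypothetical; other engines use CREATE MATERIALIZED VIEW, with their own rules):

```sql
-- SCHEMABINDING is required, which freezes the underlying table definitions;
-- an aggregating view must include COUNT_BIG(*); and amount must be declared
-- NOT NULL, since SUM over a nullable expression is not allowed here. These
-- are the kinds of structural limitations mentioned above.
CREATE VIEW dbo.order_totals
WITH SCHEMABINDING
AS
SELECT customer_id,
       SUM(amount)  AS total_amount,
       COUNT_BIG(*) AS row_count
FROM   dbo.orders
GROUP BY customer_id;
GO

-- The unique clustered index is what actually materializes the view.
CREATE UNIQUE CLUSTERED INDEX IX_order_totals
    ON dbo.order_totals (customer_id);
```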
Optimization must always be based on freshly collected metrics. It should go roughly in this order (adjusted slightly for your particular applications, databases, and the issues you have previously experienced), one step at a time, with fresh metrics collected between each:
- remove indexes no longer deemed necessary (see the index-usage sketch after this list)
- add indexes newly deemed necessary
- simplify views where possible
- optimize queries, especially to reduce scans (see the sargability example after this list)
- reduce application misuse of the database (unnecessary and repeated queries are a common culprit; see the N+1 example after this list)
- add/change application-tier caching of database information
- materialize/index views where appropriate (as in the indexed-view example above)
- optimize stored procedures (see the cursor-elimination sketch after this list)
- create generated search/reporting tables (see the reporting-table sketch after this list)
- investigate fundamental issues with the design of the database schema (data types, relations, etc.), but not from a “hey, let’s denormalize” perspective
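The index-usage sketch referenced above: most engines expose statistics showing which indexes are actually read versus merely maintained. This version uses SQL Server's DMVs (the DMV and column names are real; the 10x write-to-read threshold is an arbitrary starting point, not a rule):

```sql
-- Indexes that are maintained on every write but rarely read are candidates
-- for removal; the same reads-vs-writes idea carries over to other engines'
-- statistics views.
SELECT o.name                                        AS table_name,
       i.name                                        AS index_name,
       s.user_seeks + s.user_scans + s.user_lookups  AS reads,
       s.user_updates                                AS writes
FROM   sys.dm_db_index_usage_stats s
JOIN   sys.indexes i ON i.object_id = s.object_id AND i.index_id = s.index_id
JOIN   sys.objects o ON o.object_id = s.object_id
WHERE  s.database_id = DB_ID()
  AND  i.is_primary_key = 0
  AND  s.user_updates > 10 * (s.user_seeks + s.user_scans + s.user_lookups)
ORDER BY writes DESC;
```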
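The sargability example referenced above: a common cause of scans is wrapping an indexed column in a function, which prevents the optimizer from seeking the index. Table and column names are hypothetical:

```sql
-- Before: the function on order_date defeats any index on that column,
-- forcing a scan.
SELECT order_id, amount
FROM   orders
WHERE  YEAR(order_date) = 2008;

-- After: an equivalent range predicate is sargable, so an index on
-- order_date can be seeked instead of scanned.
SELECT order_id, amount
FROM   orders
WHERE  order_date >= '2008-01-01'
  AND  order_date <  '2009-01-01';
```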
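The N+1 example referenced above: the classic application misuse is issuing one query per row in a loop when a single set-based query would do. Schema is hypothetical:

```sql
-- Before (issued from application code in a loop -- one round trip per customer):
--   SELECT * FROM orders WHERE customer_id = ?;   -- repeated N times
--
-- After: one round trip returning everything the loop was fetching.
SELECT c.customer_id,
       c.name,
       o.order_id,
       o.amount
FROM   customers c
JOIN   orders o ON o.customer_id = c.customer_id
WHERE  c.status = 'active';
```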
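The cursor-elimination sketch referenced above: the most common stored-procedure win is replacing row-by-row cursor logic with one set-based statement. T-SQL flavor; schema is hypothetical:

```sql
-- Before: a cursor updates one row per iteration, paying per-row overhead.
--   DECLARE order_cursor CURSOR FOR SELECT order_id FROM orders WHERE ...;
--   ... FETCH / UPDATE / FETCH loop ...
--
-- After: one set-based UPDATE does the same work in a single statement.
UPDATE o
SET    o.status = 'overdue'
FROM   orders o
WHERE  o.status   = 'open'
  AND  o.due_date < GETDATE();
```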
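The reporting-table sketch referenced above: note that a generated table is rebuilt from the normalized source on a schedule, which is very different from denormalizing the source schema itself. Schema is hypothetical, and scheduling is left to your job runner:

```sql
-- A reporting table regenerated on a schedule. The normalized source tables
-- remain the single source of truth; this table is disposable output.
CREATE TABLE daily_sales_report (
    sale_date    DATE          NOT NULL,
    customer_id  INT           NOT NULL,
    order_count  INT           NOT NULL,
    total_amount DECIMAL(12,2) NOT NULL,
    PRIMARY KEY (sale_date, customer_id)
);

-- Nightly rebuild step:
TRUNCATE TABLE daily_sales_report;

INSERT INTO daily_sales_report (sale_date, customer_id, order_count, total_amount)
SELECT CAST(o.order_date AS DATE),
       o.customer_id,
       COUNT(*),
       SUM(o.amount)
FROM   orders o
GROUP BY CAST(o.order_date AS DATE), o.customer_id;
```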
You’ll notice that denormalizing your fundamental schema isn’t even on the list; it is too tempting and dangerous to put it there. True, it may be a last resort if all of the above fail. But if all of the above fail, you almost surely have the kind of problem everyone would love to have: your application is going gang-busters, and you can afford to throw more hardware at your current model. Start by increasing spindle count: get transaction logs onto their own spindle(s) first, then get indexes onto their own spindle(s), as sketched below. If that still isn’t enough, you are into Google/Amazon/etc. territory and can get some doctorate types to start designing their own persistence model for you. They’ll probably cook up something decidedly non-relational, and with much denormalization.

But if so, then why not just up and denormalize now? Simple: the other possibilities need to be exhausted first, through repeated execution of the above list of steps, until you can be certain that violating the correctness guarantees of your relational database through denormalization is worth it.
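For the spindle suggestion, the index half might look like the following sketch in SQL Server terms (database, drive paths, and names are hypothetical; relocating the transaction log file itself additionally requires taking the database offline or detaching it):

```sql
-- Put index data on its own drive/spindles via a dedicated filegroup.
ALTER DATABASE Sales ADD FILEGROUP IndexFG;

ALTER DATABASE Sales
ADD FILE (NAME = 'SalesIndexData',
          FILENAME = 'E:\SQLIndexes\SalesIndexData.ndf',  -- separate physical drive
          SIZE = 512MB)
TO FILEGROUP IndexFG;

-- Rebuild an existing index onto the new filegroup.
CREATE NONCLUSTERED INDEX IX_orders_customer_id
    ON dbo.orders (customer_id)
    WITH (DROP_EXISTING = ON)
    ON IndexFG;
```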