HTML Validation: Does It Matter?

I got about halfway through the page before giving up on reading all of the comments, but (for the first time ever) I feel compelled to say I also think you forgot something very important in your article:

Yes, it’s true that most browsers will render almost any website - whether it’s well or badly coded - but not all browsers render bad code in the same way. How can you make a cross-browser website that way?

I’ve personally spent many, many hours trying to get websites to look the same in every (major) browser. When you (and the browsers) don’t commit to a standard (be it HTML or XHTML, of whatever version), how can you create a website that everyone sees the same way - namely, the way you wanted your visitors to see it?

So, immediately I began to wonder: is anybody validating our JavaScript? What about our CSS?

Well, the JS interpreter validates JS. There’s also JSLint (http://jslint.com/), which performs a stricter form of validation.

Regarding CSS, the W3C has a validator: http://jigsaw.w3.org/css-validator/

Personally, I believe it is worth the effort to write HTML that validates as at least XHTML Transitional. That guarantees the code is valid XML, which means I can use it with XQuery, XPath, XSLT, and so on.
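To make that concrete: once the markup is well-formed XML, ordinary XML tooling can query it directly. A minimal sketch using Python’s standard-library ElementTree (the sample markup is invented, and ElementTree stands in for a full XQuery/XSLT engine):

    import xml.etree.ElementTree as ET

    # Valid XHTML is well-formed XML, so a plain XML parser accepts it as-is.
    xhtml = """<html xmlns="http://www.w3.org/1999/xhtml">
      <head><title>Example</title></head>
      <body>
        <p><a href="/about">About</a></p>
        <p><a href="/contact">Contact</a></p>
      </body>
    </html>"""

    root = ET.fromstring(xhtml)
    ns = {"x": "http://www.w3.org/1999/xhtml"}

    # One XPath-style expression pulls every link target out of the document.
    for anchor in root.findall(".//x:a", ns):
        print(anchor.get("href"))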

Although I frequently attack standardistas who fixate on minutiae of HTML and CSS, while effectively ignoring software engineering, scalability, design, usability, business workflows, marketing, expedience, and all the other factors that weigh in on real websites, this post is just plain irresponsible.

Web standards exist to rein in the MADNESS that we long-time web people lived through 10 years ago. The amount of lost productivity could only be measured in the billions. Thankfully, due to the standards movement, it’s now possible to write flip articles like this. But be careful not to throw the baby out with the bath water. Standards are what bring us to a world of less browser testing, greater accessibility, and ultimately greater productivity. A general vote against validation is a vote for increasing the number of development headaches we’ll face in the years to come.

Looks like I got to the party late. Oh well, here goes anyway…

To all you people, maybe even the author, who are saying Who cares?: remember to bring that attitude up when interviewing. I damn sure wouldn’t want you working for my company. You’re all the type that wrote those crappy, spaghetti-code VB apps, aren’t you?

Back in the day, visual tools could have helped Mom and Pop write valid HTML. Today? I know of no one, outside of web developers, who writes HTML any more. They use blogs, Facebook, etc.

Shitty markup is dead and lazy morons that still produce it should be fired and sent back to the McJobs they are qualified for.

It’s a matter of what you think your responsibility is to the wider developer community. If you’re within some very small percentage instead of just a small percentage, things improve. As the old saw goes, it’s never the same small percentage. Go from 99% to 99.9% and you reduce the possible set of global quirks by at least some factor, maybe not the full order of magnitude, but enough to make a difference in how difficult it is to create browser tech, and that’s one more strength for the web to leverage for its growth.

Maybe you don’t care personally. Certainly that’s been the tone of this blog of late. But it’s worth considering.

Google actually ranks its indexed pages. The more valid the (X)HTML of your pages, the higher they’ll appear in a search.

Wow, a lot of soap boxes out there…

In the theoretical world, yes, there are standards. However, in the practical world, if the standards are not enforced then there are no standards. I find I’m more productive if I live in the practical world and not the theoretical world - and no, I’m not lazy, just efficient.

Jeff, keep up the good work.

If you do write XHTML, you’d better get it right. I heard about CodeProject practically dropping off the map because of an XHTML error that caused Google to stop ranking them. Also, I’ve heard Safari can refuse to render XHTML that’s not according to Hoyle. Apparently spiders and browsers expect sloppy HTML, but if you say you’re writing anal-retentive XHTML, they take you at your word.

Go from 99% to 99.9% and you reduce the possible set of global quirks by at least some factor

Given everything I’ve learned to date about browser idiosyncrasies, I find it very hard to believe that validating as HTML 4.01 Strict will make any difference whatsoever.

For one thing, validating HTML says nothing about the CSS or JavaScript that drive most sites these days…

  1. Validating as XHTML 1.0 Transitional is pretty easy nowadays; Strict, less so. However, try writing CSS that validates and works in Internet Explorer. If you can’t write XHTML that validates, there’s something wrong with your tools. But this isn’t about bashing Microsoft again. Okay, it is; they’re an easy target.

  2. Ever wonder why web browsers are so hard to build? Why they are chock full of security bugs? Ever considered writing your own web browser? I didn’t think so. The reason so many XHTML parsers fail fast is that we want to get away from this liberal-about-inputs world and return to a somewhat saner fail-on-bad-inputs world. Even writing a simple HTML parser for web scraping can be such a waste of time, because you have to assume the input will be invalid (see the sketch after this list).

  3. It’s not terribly important for your site or your users that you be good about (X)HTML validation, but it is for the community at large. Which basically means everybody will start caring at about the same time pigs start flying.
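To make the second point concrete, here is a minimal sketch of that split using Python’s standard library (the sloppy fragment is invented): the lenient HTML parser swallows whatever it’s given, while the strict XML parser rejects the same input outright.

    import xml.etree.ElementTree as ET
    from html.parser import HTMLParser

    sloppy = "<p>unclosed paragraph<br><b>bold <i>nested wrong</b></i>"

    # "Liberal about inputs": html.parser fires events for whatever it can recover.
    class TagLogger(HTMLParser):
        def handle_starttag(self, tag, attrs):
            print("saw", tag)

    TagLogger().feed(sloppy)  # happily reports p, br, b, i

    # "Fail on bad inputs": an XML parser refuses the same fragment.
    try:
        ET.fromstring(sloppy)
    except ET.ParseError as err:
        print("rejected:", err)

Every scraper written against the first kind of parser has to carry the weight of guessing what broken markup was supposed to mean; the second kind simply refuses to guess.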

I don’t find writing proper (validating) XHTML Strict at all difficult once you learn what is and isn’t allowed. To me XHTML makes more sense than HTML 4: if we are going to write pseudo-XML, then why not write real XML?

I’m not by any means a front-end guru, but I would think that valid HTML becomes much more important when it comes to Section 508 compliance. Screen readers get confused a lot more easily than browsers do, and invalid HTML can’t help the situation.

Know the rules, then break them if you have a good reason to. It’s as simple as that.

Nobody cares if your HTML is valid
As pointed out, if indexing and your Google ranking don’t matter, then take Jeff’s advice that it’s up to you; otherwise, validate your pages.

The reason so many XHTML parsers fail fast is that we want to get away from this liberal-about-inputs world and return to a somewhat saner fail-on-bad-inputs world.

The sooner we can smash Pandora back in that darn box, the better off we all will be!

I dunno, there’s a fine line between thought leadership and tilting at windmills. Validate if it’s important to you. But realize that it’s a small part of the overall equation.

if we are going to write pseudo-XML, then why not write real XML?

Indeed, why write a comment when you can write a novel? Why pilot a boat when you could captain a battleship?

No, really, why would you expect target= to work in the Strict DTD? It’s a frames-related attribute and as such is valid in the Frameset DTD only.

And as to it being harmless… no, it isn’t. It’s a quite basic usability breaker. See, a normal link (no target=) opens wherever I want: if I want it in the same window/tab, it will open there; if I want it in a new one, it will open there, as instructed. But there’s no fscking way to tell your browser not to open a link in a new window/tab when a target= forces it to.

@Vordreller and John: citation needed.

Apparently spiders and browsers expect sloppy HTML, but if you say you’re writing anal-retentive XHTML, they take you at your word.

Even if that’s true, virtually nobody does that anyway. Serving true XHTML is more than simply adding a doctype or an xmlns attribute, and if spiders relied on just that, easily more than half the websites which claim to use XHTML (via doctype) would be dropped from their index.
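If you want to check what a server is actually claiming, the Content-Type header is the thing to look at, not the doctype. A rough Python sketch (the URL is hypothetical):

    from urllib.request import urlopen

    def served_as_xhtml(url):
        # Browsers only switch to the strict XML parser when the server sends
        # application/xhtml+xml; anything served as text/html gets the forgiving
        # tag-soup parser, whatever the doctype says.
        with urlopen(url) as resp:
            ctype = resp.headers.get("Content-Type", "")
        print(url, "->", ctype)
        return "application/xhtml+xml" in ctype

    # served_as_xhtml("http://example.com/")  # hypothetical URL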

The opinions in this post are very similar to my feelings on using FxCop in .NET development. FxCop finds a few important defects very quickly for you if you’ve been fixing violations all along.

But if you never fix the “Specify IFormatProvider” and “Specify CultureInfo” warnings, you’ll never find the “Exceptions should be public” errors that bite you in the future. I think electrical engineers refer to this as the signal-to-noise ratio.

I fully agree with this post: HTML validation, as it currently stands, is only useful for finding structural errors in your code, such as mismatched tags.

For one thing, lots of useful things are, as you said, impossible or impractical with valid code. JavaScript elements can only appear in certain places. Anchors can’t have targets. There are other examples. In many cases the validator is essentially wrong, since no browser implements those rules. I believe HTML 5 is relaxing the rules a bit to match the way browsers actually work.

Other things are just noise warnings for HTML: URLs are supposed to be SGML-escaped (i.e. the ampersand written as &amp;). This never matters in practice, so why do it?
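For anyone unclear on what the validator is actually asking for there, a tiny Python illustration (the query string is made up):

    from html import escape

    href = "search?q=validation&page=2"

    # What most of us write: a raw ampersand inside the attribute value,
    # which the validator flags.
    print('<a href="' + href + '">results</a>')

    # What the spec wants: the ampersand written as &amp; inside the attribute.
    print('<a href="' + escape(href) + '">results</a>')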

I do agree that td width="80" is wrong, however, and so is td style="width: 80px". The width should be declared in the CSS for the page. I wish HTML validators could highlight style= as an error, to make it easier for me to find hard-coded styles.
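In the meantime, a throwaway checker along these lines does the job (a rough Python sketch, not a real validator; the sample markup is invented):

    from html.parser import HTMLParser

    class InlineStyleFinder(HTMLParser):
        # Flag any tag carrying an inline style= or a presentational width= attribute.
        def handle_starttag(self, tag, attrs):
            for name, value in attrs:
                if name in ("style", "width"):
                    print(f"<{tag}> has {name}={value!r} -- move it into the stylesheet")

    InlineStyleFinder().feed('<td width="80"><span style="width: 80px">x</span></td>')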