Revisiting the XML Angle Bracket Tax

Occasionally I'll write about things that I find sort of mildly, vaguely thought provoking, and somehow that writing turns out to be ragingly controversial once posted here. Case in point, XML: The Angle Bracket Tax. I'm still encountering people online who almost literally hate my guts because I wrote that post. You'd think I kicked their dog, or made inappropriate romantic overtures toward their significant other.


This is a companion discussion topic for the original blog entry at: http://www.codinghorror.com/blog/2008/06/revisiting-the-xml-angle-bracket-tax.html

Hi Jeff,

I presume you’re using Web Forms for your views on Stack Overflow? Was this a considered decision? Did you consider any of the alternatives such as NHaml?

Cheers,

Andrew.

One thing I really like about your blog is your quotes from other people. I find those quotes are usually the most reliable part of your posts, since knowing that someone other than you agrees with the point makes the information more reliable. Quotes are for highlighting points about a topic, not for placing parts of other people’s text (or even your own from a previous publication) into your own. That said, I think you should have taken that part of your podcast, rewritten it, and placed it as normal text.

About the topic, I find that it mostly does not matter. Even more, I think it’s better this way because it’s standardized. Standards are good, even if they are bad. A standard is better than each one having its own language and its own parser and so on.

Elaborating a bit on the Dan Gilbert book:

"The Futile Pursuit of Happiness"
http://www.wjh.harvard.edu/~dtg/Futile_Pursuit.htm

I hate to read XML, but, as Hoffman said, it’s standardized and almost any dev CAN read it. While it may benefit you to use something else, it will hurt the people who have to read that later.

I do agree that you need to be sane about what you use, but when it’s only a small difference I say take a hit for the rest of the community.

The other thing XML has going for it is LINQ-to-XML and XML literals in VB.NET. There isn’t a much easier way to write/parse data in .NET.

I’m actually using this summer to teach myself more about XML and what it can actually do for me past a few simple scripts for websites. Can anyone recommend me a good book for learning XML?

Hi Jeff. I’ve recently defended your blog to a fellow programmer who falls into your “passionate hatred” camp, and he’s not the only guy I know who thinks similarly. On the other hand, I know a lot of people like myself that recognise you’re just another bod. Personally what draws me here isn’t the technical excellence, but the fact you can repeatably string consistent articles together that can be read by the average joe programmer (me).

I think you’re overstating the mental parsing problem just slightly, and would almost dare to posit that if you can’t substitute “fruit=foo” with “<fruit>foo</fruit>” after a few years of Visual Studio, then you’re probably in the wrong business. :wink:

Another aspect is that this could be seen as a tools problem. Over the past year or so I’ve come to the opinion that for any formally structured data, there almost certainly exists a more efficient, “humane” representation that should be implemented in a GUI for manipulating that data structure.

While there aren’t such things around (yet) for things like C or C# code, there exist quite a few XML editors that implement a number of different graphical interfaces to viewing/editing XML. The beauty in the generality of XML is that a user/programmer is free to pick from any number of different representations that he may use to manipulate the Infoset. The textual tags are just one widely used representation.

I’m going to share with you the first paragraph of Simon St. Laurent’s 1998 “Why XML” article:

“The computing press has found a new savior for the ills that afflict computing and the web: XML. XML is new, it’s exciting, and it’s got to be good, because the specification for it looks indecipherable. XML’s hype level has already drawn fire from some quarters, from those accusing it of ‘balkanizing the web’ or of increasing the load on an already strained Internet.”

10 years already. It does seem indecipherable at times (especially when you’re dealing with large XML content).

Here’s the link: http://www.simonstl.com/articles/whyxml.htm

While XML is horrible to read and isn’t going to make anyone happy, using anything else is likely to make someone seriously unhappy. Have you ever tried parsing a CSV or similar proprietary file whose documentation not only was lost years ago but didn’t handle the data type you are trying to add anyway?

The mental cost of reading XML is far outweighed by the benefit of being sure that you will be able to read it. Definitely a case of worse being better.

You should go one step further and prove your tax. Write a program using Visual Studio that yields the same result with each dataset. Wouldn’t it be safe to say that if everyone wrote their own programs to reach the same result that the XML version would be more consistent than the non-XML version because it’s based on a standard?

XML is generally excise. (Doesn’t About Face have an entire chapter on this?) When XML is presented to humans as the main means of modifying data or software state, you should be using something different (i.e., an actual UI). That said, our content management system wouldn’t exist without XML and XSLT, and I love both very much. Our users are none the wiser, however. XML is the pain that software developers bear so our users may lead happier, healthier lives.

Once again, the issue is that XML isn’t meant to be parsed by a human. Its intention is not to be human readable - the verbosity that is so annoying to a human brain (because we can interpret the meaning from context) is absolutely essential for software. Thus, I think the solution would be a translation layer for human viewing/editing of XML files. I’m sure that XML viewers/editors already exist (a quick Google search shows that they do). Maybe you should give one a shot? If you can get a plugin for Visual Studio, the entire problem would be solved.

My problem with XML isn’t as much the strain of having to read it; it’s more of how bloated it has become.

If I recall correctly, XML was derived from SGML, which is also where HTML has its basis. So, in theory, XML is really just another text markup language. I’m not going to argue with the ability to create your own markup tags that can be parsed to mean whatever you want them to - quite the opposite, in fact. That feature is (hands-down) the most powerful aspect of XML.

Unfortunately, when you give people that much power, it inevitably goes downhill. Think of what XML was intended for (custom text markup), and now think about what it is being used for nowadays (configuration files, data transmission, data persistence, reporting, etc.) How much of the “usefulness” of XML is due to the ability to throw whatever you want into a file along with the rest of a loose collection of information, which might not even be relevant?

This doesn’t even begin to take into account the extra overhead associated with parsing, reading and writing the information, as you said in your previous post. Add that into the mix, and (to me, anyways) the case against the widespread proliferation of XML grows stronger with each opening and closing tag.

So the question I pose is this: is the advent of XML as a universal data type (for lack of better wording) making us better programmers, or is it causing us to slide backwards into the olden days of placing everything having to do with anything into one place for “easier” access?

You must love WPF :wink:

Hi Jeff. I’ve recently defended your blog to a fellow programmer who falls into your “passionate hatred” camp, and he’s not the only guy I know who thinks similarly.

“The dogs bark: a sign that we’re riding, Sancho”. (Don Quixote, via Jorge Diaz Tambley)

Once again, the issue is that XML isn’t meant to be parsed by a human. Its intention is not to be human readable

I desperately wish someone would explain this to all the people writing XML files. Oh wait, we have.

Wouldn’t it be safe to say that if everyone wrote their own programs to reach the same result that the XML version would be more consistent than the non-XML version because it’s based on a standard?

The idea that there are only two choices, XML or “write it all yourself”, is sort of… a lie.

YAML is based on a standard, too:
http://www.yaml.org/

The power of software development is that it is one of the most efficient methods of expressing our will. Once it was people being taught a process, then it was mechanically expressed in assembly lines, after that we had hard-wired chips, and now it has moved into software. But no matter how this has changed, it has always been about the best method to express our will, and the backbone of that is passing information efficiently. It isn’t about XML, CORBA or whatever… however certain XML may seem as a format for storing data for the next 100 years, in 20 years we’ll look back and laugh. I think a good phrase here is, “Every 1000 years, the followers of the current mainstream religion look back at the followers 1000 years ago and ridicule them”. The difference for us is that we see multiple changes like this within our own lifespan, and yet, when we’re stuck in the middle of the current new fad, we lose perspective and somehow forget about the last 10 technologies which were each the promised silver bullet.

How about the mental cost of learning the syntax of a bunch of new parsing languages? YAML, ini, bleh. I already know XML, why would I care to learn additional mechanisms for storing configuration/data persistence?

How about the anguish of working with immature and buggy APIs that parse these languages compared with the proven and stable APIs that are built into Java/.NET? I don’t need an external DLL. I don’t need to unit test that piece of code. With XML, it just works right out of the box.
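For what it’s worth, the “works right out of the box” point can be illustrated with a standard library parser. The sketch below uses Python’s `xml.etree.ElementTree` rather than the Java or .NET APIs mentioned above, purely as a swap for brevity. The `<config>` document, its element and attribute names, and the `load_config` helper are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical config document; every element and attribute name is invented.
CONFIG_XML = """<config>
  <fruit>foo</fruit>
  <server host="example.org" port="8080"/>
</config>"""


def load_config(text):
    """Parse the XML text and return the settings as a plain dict."""
    root = ET.fromstring(text)
    server = root.find("server")
    return {
        "fruit": root.findtext("fruit"),
        "host": server.get("host"),
        # Attribute values are always strings, so convert explicitly.
        "port": int(server.get("port")),
    }


if __name__ == "__main__":
    print(load_config(CONFIG_XML))
```

No third-party dependency is involved, which is roughly the argument being made here. Note that the type conversion for `port` is still the caller’s job.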

How about training costs? I lead a team of 5 engineers. I have not had to explain XML to a single one of them because they have either known it coming in (due to the pervasiveness of XML and .NET) or they were smart enough to look it up on the internet. Can you say that for the configuration flavors of the month you propose?

Jeff, I think your frustration comes from a lack of tackling enterprise-level apps. These rants are starting to sound like Joel not liking exceptions or the need for a new language. You are so overly concerned with the little details that you miss the bigger productivity picture.

If you think about it from a Domain Driven Design (http://en.wikipedia.org/wiki/Domain_driven_design) perspective, XML is just a persistence layer. It’s unimportant and you shouldn’t be spending time on it. Focus on what matters - the domain.

The passionate hatred reminds me a little bit of some Firefox fans, for example. Don’t get me wrong – I use Firefox and I’m happy with it, but I don’t get into any heated discussions about it. It’s just a browser. But pay a quick visit to some random web forums, and you’ll inevitably see people turn into raging lunatics when they talk about how much better it is than IE, and how dare anyone say anything bad about it (or abbreviate it the wrong way, for that matter).

I know the word has become trite, but fanboyism is probably the best way to describe it. Whether it’s XML, or Firefox, or Ruby, or Linux, or Microsoft, or whatever. Use whatever you want–there’s no reason to feel threatened when someone else prefers something different. It seems as if a great many people are either insecure about their tools and software, or perhaps they consider it so much a part of their own identity that they feel a criticism of their tool is a personal attack.

Whatever the reason, that kind of reaction to your original post certainly speaks volumes about a person’s maturity level.

I already know XML, why would I care to learn additional mechanisms for storing configuration/data persistence?

I might ask you a similar question: why learn anything beyond exactly what is required?

I’m not proposing that everyone stop and rewrite every application written in the last 5 years, merely that people understand and are aware of the alternatives.

No one’s mentioned them in this comment thread, but they’re inevitable so I feel I should get it out there this time: Lisp S-expressions offer all the standardization and consistency of XML with far less syntactic noise. S-expressions were also conceived as a “machine format” as opposed to a human format, but they are eminently more usable. Why they’re not in wider use these days I have no idea.

Not much to say here besides that, but really - they’re easier to parse and generate for both computers and people. They’re lighter-weight and at least as extensible. Coincidence is not a good enough reason to maintain the use of XML over simpler, saner formats!
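To give a rough sense of the “easier to parse” claim, here is a minimal sketch of an S-expression reader in Python. It is deliberately simplified and makes assumptions a real Lisp reader would not: atoms are bare whitespace-separated tokens (no quoted strings, comments, or numbers), and the function names `tokenize`, `parse`, and `read_sexpr` are invented for this example.

```python
def tokenize(text):
    """Split S-expression text into parentheses and bare atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Consume tokens from the front and return one atom or nested list."""
    if not tokens:
        raise ValueError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens and tokens[0] != ")":
            items.append(parse(tokens))
        if not tokens:
            raise ValueError("missing closing parenthesis")
        tokens.pop(0)  # discard the matching ")"
        return items
    if token == ")":
        raise ValueError("unexpected closing parenthesis")
    return token  # a bare atom, kept as a string


def read_sexpr(text):
    """Parse exactly one S-expression from text."""
    tokens = tokenize(text)
    result = parse(tokens)
    if tokens:
        raise ValueError("trailing input after expression")
    return result


if __name__ == "__main__":
    print(read_sexpr("(config (fruit foo) (server (host example.org) (port 8080)))"))
```

The whole reader fits in a couple of dozen lines mostly because there is only one kind of delimiter and no closing tag names to match. Adding quoted strings and escapes would make it longer, so treat this as an illustration of the shape of the problem rather than a fair benchmark against a full XML parser.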

Isn’t XML just another “bug-ridden, slow, ad-hoc implementation of half of Common Lisp” with better marketing?

I really don’t mean this as a troll, I’m just so dissatisfied with XML that I react strongly when the topic comes up. Sorry!