Web Development as Tag Soup

As we work with ASP.NET MVC on Stack Overflow, I find myself violently thrust back into the bad old days of tag soup that I remember from my tenure as a classic ASP developer in the late 90's. If you're not careful bordering on manically fastidious in constructing your Views, you'll end up with a giant mish-mash of HTML, Javascript, and server-side code. Classic tag soup; difficult to read, difficult to maintain.


This is a companion discussion topic for the original blog entry at: http://www.codinghorror.com/blog/2008/07/web-development-as-tag-soup.html

Toss in another vote for wicket. It’s seriously by far one of the best, if not the best framework I’ve worked with. There is no tag soup, your logic peanut butter doesn’t get into your html chocolate, and everything is nice. No, seriously. Go try it now.

Ouch, Dave… this reminds me of one of your classics: “You can write VB in any language.”

Now you are not taking your own advice… tag soup is completely avoidable in ASP.Net… but it’s also accessible. Your choice.

For those concerned about MVC replacing Web Forms…

http://www.misfitgeek.com/Will+ASPNET+MVC+Be+The+Main+Web+UI+Platform+For+ASPNET.aspx

If I’m served tag soup as a starter, I would expect the main course to be spaghetti code.
(And by main course I mean the rest of the application)

But done well, a nice template language beats WebForm’s runat=server all the way.

I don’t think Jeff understands webforms ASP.NET paradigm, if he did he would have noticed that it is the ONLY web framework which managed to completely eliminate tag soup. And it is actually old news because Microsoft did it in 2002 with the first release of ASP.NET.

All those pieces have to be present to get rid of tag soup:
a) codebehind
b) html/server/web controls (for abstracting html and referencing/changing it in codebehind)
c) event-driven page-rendering pipeline

Ruby Envy my ass.

The code is for the machine, not for the person. As always, there is an uneasy balance between readability, scalability, performance, and well-formedness.

Personally I think a custom-tag based approach to developing reusable components, a la JSP taglibs, Faces and ASP.Net controls, is the way to go. You convert something like this:

<table>
{for (Row row : rows) }
<tr>
<td>{row.value1}</td>
</tr>
{endfor}
</table>

Into something like this:

<grid col="rows">
<col field="value1">Value1 Header</col>
</grid>

It really simplifies things and it centralizes a lot of the html (esp. if you use things like JSP-based taglibs written as JSPs–very easy to modify).

Having the ability to introduce macros (which is essentially what sorts of controls are) can really clean up the UI code. And if you’ve got a halfway decent web guy it’s also something they can be comfortable working with using their familiar toolset.
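As a rough illustration of the "control as macro" idea, here is a minimal Python sketch. The `grid` function and its `(field, header)` column format are invented for this example and are not any framework's API; the point is only that the table markup lives in one place and the caller describes columns.

```python
from html import escape


def grid(rows, columns):
    """Expand a grid description into table markup.

    rows: list of dicts; columns: list of (field, header) pairs.
    Both shapes are assumptions made for this sketch.
    """
    # Header row built once from the column descriptions.
    head = "".join(f"<th>{escape(header)}</th>" for _, header in columns)
    # One <tr> per row, one <td> per declared column, values escaped.
    body = "".join(
        "<tr>"
        + "".join(f"<td>{escape(str(row[field]))}</td>" for field, _ in columns)
        + "</tr>"
        for row in rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"


rows = [{"value1": "a"}, {"value1": "b & c"}]
table_html = grid(rows, [("value1", "Value1 Header")])
```

Changing how every grid on the site is rendered then means editing `grid` once, which is the centralization benefit described above.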

I think if you want to do this properly, you will end up at parts of compiler construction. If you really want to get rid of some tag-soup-stuff, you basically need a code generator with HTML (or XHTML) being the target language.

That basically implies that you define a certain intermediate language for yourself. This language definitely depends on your application, but in general, it ends up being some sort of sequence of lists of trees (and all that recursively).

This page would probably be a sequence of a blog entry (which consists of a list of paragraphs, which consist of marked-up words, and so on), a list of comments (which consist of paragraphs, and so on) and a way to input data.
I am not entirely sure how abstract you can get, and how abstract you want to get.

This abstract representation is generated by some database layer and passed to a code generator.
This code generator then works like an assembly generator or similar. First, it has to output a certain prelude, that is, the header of the page, news, blogroll, CSS, Javascript and all this nonsense. After this, it generates the proper HTML for the text and the comments, and generates a form for user input. After that, a certain exit code must be generated; in this case, it would be the body and html end tags.

That way, the abstraction from HTML would be as large as possible - it would be no problem to generate plain text, or even a certain UI-application.
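A minimal Python sketch of that pipeline, under the assumption that a page is just an entry plus a list of comments. The `Page`, `Entry` and `Paragraph` types and the two generator functions are hypothetical names chosen for this example; a real application would need a much richer tree.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class Paragraph:
    text: str


@dataclass
class Entry:
    title: str
    paragraphs: list = field(default_factory=list)


@dataclass
class Page:
    entry: Entry
    comments: list = field(default_factory=list)  # list of Paragraph


def to_html(page):
    """HTML back end: prelude, body, then closing tags."""
    parts = ["<html><body>"]  # prelude
    parts.append(f"<h1>{escape(page.entry.title)}</h1>")
    parts += [f"<p>{escape(p.text)}</p>" for p in page.entry.paragraphs]
    parts.append("<ul>")
    parts += [f"<li>{escape(c.text)}</li>" for c in page.comments]
    parts.append("</ul>")
    parts.append("</body></html>")  # exit code
    return "".join(parts)


def to_text(page):
    """Plain-text back end over the same abstract tree."""
    lines = [page.entry.title, ""]
    lines += [p.text for p in page.entry.paragraphs]
    lines += [f"- {c.text}" for c in page.comments]
    return "\n".join(lines)


page = Page(
    Entry("Tag Soup", [Paragraph("Views get messy.")]),
    [Paragraph("Try templates.")],
)
html_out = to_html(page)
text_out = to_text(page)
```

The database layer only ever builds a `Page`; swapping `to_html` for `to_text` changes the target language without touching it.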

I just think the problem with such a model is that it's pretty different from how most people think when they build web applications. Most say “I pull this statistical data from a database and generate that and that code in order to output”, instead of “Right, I need a certain HTML generator that translates an abstract tree into HTML, and then I need something that turns my relational data into such an abstract tree in order to output things.” Oh well, it's like that all the time if you apply compiler-internal ideas to other problems, even though the solution looks nice currently.

Greetings, Hk

I currently work, as a designer, for a large e-commerce site in the UK, and they don’t use a template system like this. All the html is coded in functions within the web application.

The result of this is that the HTML and css is coded, by the design team, separate from the main application. This is then passed to the development team who must then rip it apart and stick it all in functions.

Not only is this wasted effort, the developers often get it wrong and break the html structure, which then has to be run past the design team again to figure out what the problem is.

Another problem is that if there is no development resource, then there can be no changes to the site's html. This means you can’t get your design team to freely change the interface as you require.

Not good. Templates are a much better solution for large sites, where you are likely to need a design team.

I hate even more frameworks that make you pretend you’re writing a desktop GUI app and then generate a lot of convoluted and malformed HTML full of Javascript that happens to look like what you wanted.

Everything under my domain name passes validator.w3.org. With the exception of some extensions from the Web Forms 2.0 spec, which no browser supports anyway (but I want to avoid chicken-n-egg: browsers not supporting it because websites don’t use it; I take the first step).

I only saw one reference to Flex, and no mention of Silverlight, but
I believe our only hope is that these next generation technologies
can free us from the chains of our real enemy, the browser
environment itself!

I will third Flex. Haven’t used Silverlight yet, but the promise seems the same.

I only found web programming palatable once I could approach it like regular app development, with regular compiler and debugging support.

No tags at all, and one develops a distinct app that happens to run in the browser and communicate with the server – a nice clean separation that is easy to program with.

Now whether the user likes it that much is another thing, depends on the app really.

I’ve always found it ass-backwards to have the code embedded in the HTML instead of the other way around. I can write a readable program that emits HTML.
—scott

I usually don’t use code like that.

In PHP (Jeff, I know you hate that :)) I usually write a block of code at the beginning of the file, and then I set up some variables to print in the page template, just like this:

<?php
$title = getTitle();
?>

<html>
<head>
<title><?=$title?></title>
</head>
<body>

</body>
</html>

You chose a very bad example for python. Python supports templates by default. I tried them using pylons; you simply have to create a template like this:

<html>
<head>
<title>${title}</title>
</head>
<body>

</body>
</html>

As you have the model, the controller and the view separated (MVC), when you call a template you can replace the variables on the fly:

t = getTitle()
render('/template.mako', title=t)

Things are very separated here, aren’t they?
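For the "supports templates by default" point, the standard library's `string.Template` is enough to show the same separation without any framework. This is only a sketch: `get_title` is a hypothetical stand-in for `getTitle()`, and `string.Template` does no HTML escaping, so it is not a replacement for a real template engine such as Mako.

```python
from string import Template

# The view: markup with a named placeholder, no logic.
page = Template("<html><head><title>$title</title></head><body></body></html>")


def get_title():
    # Hypothetical stand-in for the getTitle() call in the comment above.
    return "Tag Soup"


# The controller side: compute values, then fill the template.
rendered = page.substitute(title=get_title())
```

`substitute` raises `KeyError` if a placeholder is left unfilled, which catches a misspelled variable early.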

Well, I guess that’s the price you pay for jumping on the MVC bandwagon… You could have avoided that by using the out of the box, event driven ASP.Net.

But then this post would probably be a rant about viewstate or unreadable/non-css compliant id attributes.

It always surprises me how some can call their solution the right solution for web development.

Some people argue that too much logic is in the code that Jeff posted above. But have you actually read the code? It shows page number, empty list text, etc. That is not logic, that is just user interaction and usability. Try writing the same code with asp.net form controls without using GridView, ListView, etc. For me logic is the Create Product and the Calculate Income methods.

ASP.Net is great, but ASP.Net forms is just terrible for anything but Microsoft-only standard asp.net forms. You tend to worry more about the page cycle, viewstate, etc. than the actual pages.

Anyway, I would probably, for the sport, implement my own template system if the ones I could find did not satisfy me. A framework does not have to be brain surgery.

I third the Haml recommendation. Jeff - try Haml (NHaml, in your case: http://andrewpeters.net/category/nhaml/) . Just write a single page and see what you think. The syntax weirds people out at first but once you’ve tried it, it is hard to go back.

Here, briefly, are the basic options in ASP.NET. (only looking at inline versus server controls and code-behind)

I never minded inline code for outputting simple values that are only for viewing; label controls and viewstate are overkill there. For easy stuff, it keeps the layout quite easy to read:

<h1><%= _movie.Title %></h1>
<h3><%= _movie.Director %></h3>
<span class="description"><%= _movie.Spoiler %></span>

An alternative, when databound (asp.net), can be

<h1><%# Eval("Title") %></h1>
<h3><%# Eval("Director") %></h3>
<span class="description"><%# Eval("Spoiler") %></span>

or another alternative is to use server controls when you need more control and keep the markup readable.

<h1><asp:Label id="lblTitle" runat="server" /></h1>
<h3><asp:Label id="lblDirector" runat="server" /></h3>
<asp:Label id="lblSpoiler" runat="server" CssClass="description" />

in the ‘code behind’ file, at some point in the page life-cycle, you’d write…

lblTitle.Text = movie.Title;
lblDirector.Text = movie.Director;
if (string.IsNullOrEmpty(movie.Spoiler))
{ lblSpoiler.CssClass = "missing"; lblSpoiler.Text = Resources("MissingSpoilerText"); }
else
{ lblSpoiler.Text = movie.Spoiler; }

The reason for the inline styles is allowing for non-coders to change the markup without knowing the code. Obviously this can go too far when there’s too much tweaking that needs to be done, but the idea is flexibility. Personally I don’t dig Eval, because it’s always better to catch errors at compile time, but sometimes easy works just fine. Not every page will get your undivided attention, and it does have the utility of showing what is being used.

The other benefit of inline styles in ASP.NET is (usually) you can alter the output on a live system. If you do everything code-behind then you have to recompile and redeploy. [I say ‘usually’ because in Web Deployment Projects you can specify that pages are not updatable at runtime]

The problem is that real websites, with real logic, get messy quickly no matter where the logic goes. But when things get messy, it’s easier to refactor in code than it is to refactor in html.

<%# Eval("Prop") != null ? Eval("Prop") : Resources("PropNotFound") %>
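To make the "easier to refactor in code" point concrete, here is a small Python sketch of the same fallback pulled out of the markup. It is not ASP.NET; `display_value`, the `movie` dict and the fallback strings are all made up for illustration. The view then receives ready-to-print values and contains no conditionals.

```python
def display_value(record, prop, fallback):
    """Return the value to show for prop, or fallback if it is missing or empty."""
    value = record.get(prop)
    return value if value else fallback


# Hypothetical record; in the thread's example this would be the bound data item.
movie = {"Title": "Jaws", "Spoiler": None}

# Everything the template needs, decided before any markup is emitted.
view = {
    "title": display_value(movie, "Title", "(untitled)"),
    "spoiler": display_value(movie, "Spoiler", "No spoiler available"),
}
```

If the rule changes (say, whitespace-only values should also count as missing), only `display_value` changes, not every tag that uses it.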

One thing in ASP.NET that I still find a bit persnickety is single and double quotes inside inline tags, and obviously the following is quite hard to read. I can’t remember examples offhand, but there are still times when inlining fails inside some tags but not others.

<a href='<%# Eval("Url") %>' title='<%# Eval("Title") %>'><%# Eval("Name") %></a> (watch ' versus "!)

At least MS, to their benefit, does allow flexibility. You don’t always have to do it the same way, which is very handy. You can embed the name of events in the aspx file, but you don’t have to (you can attach the delegate at runtime). You can just make a normal tag runat=server and then it becomes a member of the page. It can be a source of errors and frustration (knowing exactly where something is done) but that’s an issue of DRY and following good coding practices.

Sometimes I curse that it’s harder to write some forms of javascript because all the control names are mangled (based on the container they live in), but I get over it.

@Wheelwright: I don’t think Jeff understands webforms ASP.NET paradigm, if he did he would have noticed that it is the ONLY web framework which managed to completely eliminate tag soup.

ASP.NET not only isn’t the only framework that can avoid tag soup, the example provided shows that it didn’t eliminate it; it made it possible to avoid (mostly). Many of the current books on ASP.NET still show tag soup examples.

@mj1531: WebObjects (which inspired Tapestry) solved the server-side tag soup problem …

WebObjects is an incredible web application framework, with a thriving community. It does eliminate tag soup. And it does so many things so well. Thanks for bringing it up.

But regarding tag soup in general. ASP.NET has the tools to avoid tag soup in most cases. There are a few instances where dropping code into HTML would be hard to avoid, but it’s definitely the exception, not the rule.

So, if you find yourself doing tag soup, realize that you’re probably doing yourself more harm than good. Sometimes going back and refactoring to avoid tag soup is harder than doing it right in the first place.

Even languages that seem to promote tag soup, such as PHP, can avoid the mess if good decisions are made up front.

Coding horror cliff notes 2008-07-20:

Look, there is a problem. I don’t know the solution, do you?