Is HTML a Humane Markup Language?

Why invent something new when there are so many reasonable choices? I agree with those of you who say that an HTML posting syntax would be ideal. If that is not possible for security reasons, please don’t invent something new. Let me leverage the time spent learning Textile or Markdown or whatever existing markup technology you decide to use. My time is valuable and I’d rather spend it conveying a message than learning a new way to format one.

Addressing those wiki points:

  1. HTML focuses on content, not presentation - semantic HTML lets people focus on (or sometimes even gain a deeper understanding of the format of) their ideas.

  2. Why use domain-specific markup when you’ve already got global markup that serves all your needs?

  3. Tables aren’t any more difficult to understand than the puzzling mixture of dashes, asterisks, and brackets that wikis employ.

  4. No - they don’t need it, and you don’t have to give it to them.

  5. Only if you leave yourself open to it… “we’re too lazy/busy to address security concerns” is not a good reason.

  6. What makes Wiki markup easier to learn than HTML? Why would you learn a new markup language, which will just get converted back to HTML again? Isn’t that a little redundant? If people need to learn a markup language, why not learn the one that is universally used in every page on the web?

… I’m not a big fan of wiki markup either - bbs tags are only marginally better.

Just a thing: a way to get code coloration is, I think, necessary. Seriously.

Also, bbcode blows (and 9 out of 10 bbcode parsers are purely regex-based translators, thus break down real fast), thanks for not using it.
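To illustrate how a purely regex-based translator can break down, here is a minimal sketch. The function `naive_bbcode` and its two rules are made up for the example and are not taken from any real BBCode parser. Each tag is translated independently, so overlapping tags pass straight through as mis-nested HTML.

```python
import re

def naive_bbcode(text):
    """Hypothetical regex-only translator: one substitution per tag."""
    # Each rule looks only at its own tag and knows nothing about nesting.
    text = re.sub(r"\[b\](.*?)\[/b\]", r"<b>\1</b>", text)
    text = re.sub(r"\[i\](.*?)\[/i\]", r"<i>\1</i>", text)
    return text

# Overlapping tags come out as mis-nested HTML instead of being rejected.
print(naive_bbcode("[b]bold [i]both[/b] italic[/i]"))
# -> <b>bold <i>both</b> italic</i>
```

A translator that tracks which tags are currently open could catch this input, which is hard to do with independent substitutions alone.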

I generally agree that a subset of HTML is fine for formatting. If all you want to do is have lists and paragraphs and bold and italics, it’s exactly as clear as almost any other markup language. If you’re willing to automatically add p tags on double newlines, most people can muddle through without touching it at all.
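As a rough sketch of the "add p tags on double newlines" idea, assuming the input is plain text: the function name `auto_paragraphs` is invented for this example, and for simplicity it escapes all markup rather than combining paragraph wrapping with a tag whitelist.

```python
import html
import re

def auto_paragraphs(text):
    """Wrap blocks separated by blank lines in <p> tags."""
    # Normalize Windows line endings, then split on one or more blank lines.
    blocks = re.split(r"\n\s*\n", text.replace("\r\n", "\n").strip())
    # Escape each block so that typed angle brackets show up as text.
    return "\n".join(
        f"<p>{html.escape(block.strip())}</p>" for block in blocks if block.strip()
    )

print(auto_paragraphs("First paragraph.\n\nSecond paragraph."))
# -> <p>First paragraph.</p>
#    <p>Second paragraph.</p>
```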

However, Mediawiki is a special case, in that the html tags don’t actually fully represent what most of the corresponding markup means. As you say, it represents the structure of the data, and the structure you give the data using wiki markup has side effects beyond the formatting you’d get from basic html.

For example, take the ‘triple equal sign’ - on first glance, it’s just an h3. That doesn’t tell the whole story, though - there’s some deeper meaning to that tag. Not only does it do your h3 formatting, but it also generates a named anchor, and it automatically appends a link to it in the table of contents. It does have the same logical meaning as h3, but it does more - h3 is a subset of triple equal. You could of course impart that power upon h3, but I’d argue that’s even more confusing than having a separate syntax.

This doesn’t even touch on the templating language or the category system, both of which have no equivalent in html. So with mediawiki, you know html won’t meet all of your needs - so coming up with a language that does allow for everything only makes sense.
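The triple-equals behaviour described above could be sketched roughly like this. It is not MediaWiki's actual implementation: the `render` function, the anchor scheme (spaces become underscores), and the restriction to level-3 headings are all assumptions made for the example. The point is only that one line of wiki markup yields both a heading with an anchor and a table-of-contents entry.

```python
import html
import re

# Matches lines such as "=== Title ===" (level-3 headings only).
HEADING = re.compile(r"^===\s*([^=].*?)\s*===\s*$")

def render(wikitext):
    """Return (html_body, toc), where toc is a list of (anchor, title) pairs."""
    body, toc = [], []
    for line in wikitext.splitlines():
        match = HEADING.match(line)
        if match:
            title = match.group(1)
            # Hypothetical anchor scheme: collapse whitespace to underscores.
            anchor = re.sub(r"\s+", "_", title)
            toc.append((anchor, title))
            body.append(f'<h3 id="{html.escape(anchor)}">{html.escape(title)}</h3>')
        elif line.strip():
            body.append(f"<p>{html.escape(line)}</p>")
    return "\n".join(body), toc

body, toc = render("=== Getting started ===\nHello")
print(toc)  # -> [('Getting_started', 'Getting started')]
```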

I usually love your posts… but this is exactly the kind of attitude which stops developers from making good UI imo. You’ve obviously thought about this a lot, but you immediately ruled out all of the best approaches by making a big assumption about your target audience.

Just because you expect every good programmer to be comfortable with markup doesn’t make it so… and as you often remind us, there are plenty of bad programmers out there.

Maybe I got it a bit wrong… but I don’t think you should expect your users to understand your markup, or even HTML. Showing the markup and allowing the user to edit it is fine (à la wiki), but not implementing nice buttons and an interface is not… you shouldn’t demand anything of users that isn’t necessary imo.

I’ll take it all back if you plan on having the buttons as well… but that’s not how the post came across. :)

As much as the bugs in Blogger annoy me, the one thing they do right is to allow the user to go to source and edit the HTML. For the users that don’t understand markup, they have a WYSIWYG editor.

Reinventing a markup language is the wrong approach. I’ve been creating HTML pages since 1994, and every time I edit a Wikipedia page I roll my eyes because I still have to look up that URL syntax. I agree with Calvin: use REST if you want to implement “automatic cross referencing functionality”, but remember, that is a server-side function, not a markup issue.

One recommendation, I would create a White List of HTML you will support. This way you don’t have to try to manage a Black List of restricted tags.
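A minimal sketch of the whitelist approach, using Python's standard `html.parser`. The `ALLOWED` table, the `sanitize` function, and the rule that only `http`/`https` links survive are assumptions chosen for the example, not a vetted security policy. Anything not listed is dropped by default, so there is no blacklist to maintain.

```python
import html
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical whitelist: tag name -> attributes permitted on that tag.
ALLOWED = {"p": set(), "b": set(), "i": set(), "em": set(), "strong": set(),
           "ul": set(), "ol": set(), "li": set(), "pre": set(), "code": set(),
           "blockquote": set(), "a": {"href"}}

def safe_href(value):
    """Accept only absolute http(s) URLs (an assumption for this sketch)."""
    try:
        return urlparse(value).scheme in ("http", "https")
    except ValueError:
        return False

class WhitelistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            return  # drop the tag itself; its text content is still emitted
        kept = []
        for name, value in attrs:
            if name not in ALLOWED[tag] or value is None:
                continue
            if name == "href" and not safe_href(value):
                continue  # rejects javascript:, data:, relative URLs, etc.
            kept.append(f' {name}="{html.escape(value)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(html.escape(data))  # text is always re-escaped

def sanitize(markup):
    parser = WhitelistSanitizer()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)

print(sanitize('<b onclick="x()">hi</b> <font color="red">there</font>'))
# -> <b>hi</b> there
```

This sketch does not balance unclosed tags; a production filter would need to handle that as well.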

It seems to me that an assumption is being made that all developers know how to code in HTML. As a desktop developer I rarely, if ever, have to touch web code and hence will have to invest time and effort into learning a whole new ‘syntax’ if I am expected to format my posts correctly.

While I understand that there will always be a need to have some kind of markup, I cannot see the reasoning behind forcing us to handcraft it. If I am popping onto the site to post a question (or indeed an answer) then I probably have that problem space loaded up in my brain. Having to interrupt that and find out the correct way to mark up my post seems like a surefire way of reducing the integrity of that post.

I can see no logical reasoning why you feel the need to forgo a simple GUI driven text box, that requires minimal thought while using, in favour of forcing us to learn whatever you choose to be ‘my way is the best way’.

That just smells of elitism.

One of the books on your reading list says it all - “Don’t Make Me Think: A Common Sense Approach to Web Usability”.

I’ve always liked 37signals’ solution - give them just a few whitelisted tags for bold, italic, links, quotes. Forget about attributes. Keeps it pretty clean, and they can explain it in a sentence.

In the spirit of various other articles on this very blog, wouldn’t the correct answer be to allow both html and simpler markup?

I know HTML inside out, but given the choice I’d rather write in textile whenever possible.

“this is a site for programmers, so they should be comfortable with basic markup.”

Well, at least you put your prejudices involving what a ‘real programmer’ is right there where everyone can see them.

Isn’t the point of software to make things easier? Just because as a programmer, I can code in HTML doesn’t mean I want to code in HTML just to write a comment. WYSIWYG is not a bad word.

Although, for a developer site there is a very limited set of markup needed:

  • Plain format (the default)
  • Lists (ordered/unordered)
  • Bold/Italic/underline
  • Hyperlinks
  • Sourcecode (the biggie for a programming site)

One nice thing would be color formatting sourcecode automatically based on language.

If I were you I’d definitely go for plain old HTML but whitelist only a bunch of tags and limited attributes per tag, and of course validate properly before accepting anything. I’ve done it before and it’s rather simple.

Yet I strongly disagree with the whole “If you’re a programmer, you damn well better know HTML” thing; I might be comfortable programming for the web but that doesn’t mean all programmers are. I still know some that only do specific FoxPro-based stuff, or even just a small subset of C on embedded devices. These people are definitely programmers, but don’t have any reason to know anything about HTML.

Therefore, and because I am not you, I’d still go for a nicer custom WYSIWYG editor to generate nice, lean and valid HTML, with a code button for advanced users perhaps. I have this love for the KISS principle, but I realize it applies to my users’ point of view, not to mine.

Kris

I find the simple formatting^ offered by the Australian broadband forum Whirlpool to be quite good. It isn’t as full featured as many others but in most instances, it’s more than enough - they conveniently allow you to enter raw HTML as well if that suits your purposes better.

^ http://whirlpool.net.au/wiki/?tag=whirlcode

I’ll note that the blog Making Light uses a subset of html for comment markup, including urls, and the users there seem to have no trouble figuring it out. The users are typically science fiction nerds, but at least 2/3rds do not come from technical backgrounds, but when sufficiently motivated can even figure out html, with prompting.

I’ll append a section of prompting, but I had to mangle the angle brackets to abide by the “no HTML” rule you have for comments. (Irony!)

HTML Tags:
[strong]Strong[/strong] = Strong
[em]Emphasized[/em] = Emphasized
[a href="http://www.url.com"]Linked text[/a] = Linked text

Spelling reference:
Tolkien. Minuscule. Gandhi. Millennium. Delany. Embarrassment. Publishers Weekly. Occurrence. Asimov. Weird. Connoisseur. Accommodate. Hierarchy. Deity. Etiquette. Pharaoh. Teresa. Its. Macdonald. Nielsen Hayden. It’s. Fluorosphere. More here.

Have you tried out markItUp:

http://markitup.jaysalvat.com/home/

It’s an excellent JavaScript utility that puts a friendlier face on the standard textarea. I use the HTML version to allow my users to enter snippets of XHTML, but it also works with Markdown, Textile, etc.

I also wanted control over what tags and attributes the users are allowed to enter, so I wrote an extension that validates the XHTML by parsing it against a list of valid tags/attributes (defined in JSON).
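For a sense of what validating against a JSON-defined list can look like, here is a small sketch. It is written in Python rather than as a markItUp extension, and the rule set in `RULES_JSON` and the `validate` function are invented for the example. Unlike a filter that silently strips markup, it reports each violation so the user can be told what to fix. It does not check XHTML well-formedness.

```python
import json
from html.parser import HTMLParser

# Hypothetical rule set: tag name -> list of attributes allowed on it.
RULES_JSON = '{"p": [], "em": [], "strong": [], "code": [], "a": ["href", "title"]}'

class WhitelistValidator(HTMLParser):
    def __init__(self, rules):
        super().__init__()
        self.rules = rules
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.rules:
            self.errors.append(f"tag <{tag}> is not allowed")
            return
        for name, _value in attrs:
            if name not in self.rules[tag]:
                self.errors.append(f"attribute '{name}' is not allowed on <{tag}>")

def validate(markup, rules_json=RULES_JSON):
    """Return a list of human-readable violations (empty if the markup passes)."""
    validator = WhitelistValidator(json.loads(rules_json))
    validator.feed(markup)
    validator.close()
    return validator.errors

print(validate('<p style="color:red">hi</p>'))
# -> ["attribute 'style' is not allowed on <p>"]
```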

I’m wondering if you have put any thought into automatically supporting syntax highlighting for code snippets? I’ve used several sites that do this (forums.devshed.com, for example), and while it’s rarely perfect it can really help when reading through posted questions.

A vote for Textile. If you just want to type, and be able to enter some bold text, lists, headings or links, it works very naturally. Novices have no problem with it either (I use it as the default formatter for a CMS backend). I prefer it (slightly) over Markdown because I find the way you enter headings in Markdown clunky.

Textile also makes quotes beautiful: &ldquo;Like this.&rdquo; when you enter "Like this."

Doesn’t the conclusion here go against the reasoning in the “XML: The Angle Bracket Tax” post a couple of days ago? I know you can get away without all the closing tags in HTML, so it is slightly better than XML, but to willfully twist your own words:

  1. Should HTML be the default choice? The authors of most of the styled text entry code that has been developed would probably say NO to this.
  2. Is HTML the simplest possible thing that can work for your intended use? NO.
  3. Do you know what the HTML alternatives are? YES
  4. Wouldn’t it be nice to have easily readable, understandable posts, without all those sharp, pointy angle brackets jabbing you directly in your ever-lovin’ eyeballs? Ummm, Yes?

As pointed out above, whatever you do, it isn’t actually going to be HTML. You’re going to have to add your own stuff to it and limit it in some ways. I admit my HTML knowledge is basic, but I’ve no idea how to enter some syntax-highlighted JavaScript in HTML, whereas I can manage it in MediaWiki syntax.

Your link to “why doesn’t wiki do HTML” is broken. It should be

http://c2.com/cgi/wiki?WhyDoesntWikiDoHtml

I vote for html format with a WYSIWYG editor, such as one of these:

http://www.geniisoft.com/showcase.nsf/WebEditors

Just because I know HTML doesn’t mean I always want to type it, or any other markup, to enter a comment. All of the editors I have looked at supply a Design (lazy) mode and a raw HTML mode.