The Future of Markdown

Phanley · October 30, 2012, 12:00am

I’m skeptical of markdown, and it isn’t because I don’t understand it - it’s because, yes it’s a nice quick way to write formatted text… until it isn’t, and all you want is some combination of two elements like indented but not a quote or code but the leading few words to be bold… or whatever, and suddenly 20 minutes is wasted trying to figure out why the markdown editor is being so stupid.

https://github.com/hanleybrand/mdid3-sysadmin/wiki/shell_plus-and-poking-around is a page I was working on recently where I had a number of “WHAT” moments when I’d check the page formatting - I get that it’s an edge case, but it’s one where I think I would have been better off using FCK to style html for me if I couldn’t be bothered to write the html, which frankly isn’t that hard anyway.

And that’s why I’m skeptical of markdown. The simple utility is nice enough, but we already have a very parsable text format in HTML - and the wysiwyg editors that are current are very nice and can be configured to write pretty structured html. And as much as people kvetch about html formatting surprises, at least I only had to know the foibles of one markup language. Except now people agree I have to know two, and the other I have to know in terms of itself, and how it renders to HTML, which is what everyone seems to do with it anyway.

So that makes it seem a lot less great, in my view.

Youri_Lima · October 30, 2012, 12:00am

should we say “AYE” or something?

A_C · October 30, 2012, 12:00am

I agree wholeheartedly about Markdown. In fact, I created this (currently incomplete) website that advocates using Markdown for, at least, web content: http://markdownforcontent.com/ – I think it does need some leadership and a set of standards that we can hopefully all get along with.

An important point about Markdown: I think it should be fully readable in both pre- and post-transform states.

D1118 · October 30, 2012, 12:00am

I hope that they don’t forget RTL languages

JeffreyD · October 31, 2012, 12:00am

Like most things, the problem isn’t the technology. It’s the people. You would have to get all the people on the internet to agree on it. It’s a lot more challenging than just having something that works well. It’s gotta be fun, useful, trendy and all that jazz.

DaveP · October 31, 2012, 12:00am

Jeff, are you going to leave it like this?

For those interested, the community group is open for anyone to join.

See http://www.w3.org/community/markdown/

I’m disappointed Jeff.

CodesI · November 2, 2012, 12:00am

Changes I’d like to make to markdown:

Allow Markdown in block tags, especially inside tables
Better backtick escaping rules. The rules for having backticks in backtick enclosed inline code are weird ATM. I suggest simply doubling a backtick to escape it.
When numbering items in a list, the rendered number should be identical to the written number.
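
As a rough illustration of the doubled-backtick suggestion, here is a minimal sketch in Python. The rule it implements is the hypothetical one proposed above, not the behavior of any existing Markdown implementation, and the function name parse_inline_code is made up for the example.

```python
def parse_inline_code(text: str) -> list[tuple[str, str]]:
    """Split text into ("text", ...) and ("code", ...) segments.

    Hypothetical rule (not standard Markdown): inside a code span,
    two consecutive backticks stand for one literal backtick.
    """
    segments: list[tuple[str, str]] = []
    buf: list[str] = []
    in_code = False
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "`":
            # Inside a code span, a doubled backtick is an escaped literal.
            if in_code and text[i + 1 : i + 2] == "`":
                buf.append("`")
                i += 2
                continue
            # Otherwise a single backtick opens or closes a code span.
            if buf:
                segments.append(("code" if in_code else "text", "".join(buf)))
                buf = []
            in_code = not in_code
            i += 1
            continue
        buf.append(ch)
        i += 1
    # An unterminated span is kept as code here; a real parser would
    # have to decide how to treat it.
    if buf:
        segments.append(("code" if in_code else "text", "".join(buf)))
    return segments


if __name__ == "__main__":
    print(parse_inline_code("use `a``b` here"))
```

With this rule there is only one thing to remember: a single backtick toggles the code span, a pair inside the span is a literal backtick.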

Evan_Plaice · November 2, 2012, 12:00am

I see what you did there…

def change_the_world(platform):
    # rally the community whenever the platform misbehaves, then sit back
    if platform in (inconsistent, irritating):
        incite_the_community(platform)
    wait(changes)
    return feeling_accomplished

def incite_the_community(platform):
    article = f"complain about the shortcomings of {platform}"
    # one suggestion and one call to action per influential person
    for person in influential_people:
        article += "helpful suggestion"
        article += f"{person} should get involved"
    blog.append(article)

Do you still program computers or did you get bored and decide to dedicate yourself to full-time social hacking? It must be nice to get to the level of accomplishment in life where you can focus a lot of energy on coming up with good ideas rather than being inundated with implementing them.

If your thought process even remotely follows the ideas that Joel Spolsky presented at his Google Talk back when StackOverflow was still a ‘new thing’, your suggestions subtly hint of social engineering.

Please don’t take my remarks the wrong way, I mean that with the sincerest respect.

As for standardizing Markdown, it’s more of a social than technical problem. Already, a very loosely defined pseudo-standard has been released into the wild and there is a greater ecosystem that revolves around many of the independent implementations (obviously). What you’re seeing is a result of the ‘bike shed’ effect. While it’s somewhat difficult to create a complete end-to-end Markdown solution, it’s much easier to create a 90% solution with custom exceptions. I wouldn’t be surprised if John Gruber is unreachable because he’s fed up with hearing color suggestions for the shed.

No matter what you do the majority of the shed painters will never fall in line with a standard. I’d bet that you have heard of ‘Confirmation Bias’ before. Let me present exhibit A.

To break through that the specification and the implementation need to be of a high enough caliber to hijack the ‘Markdown’ namespace. If that can be achieved then every other iteration will be considered ‘just another copy’ and the non-standard branches will wither. W3C did it with the HTML spec, Apple did it with electronics design, Google does it with everything they can. I’m not a Sci-Fi enthusiast but even I know that ‘2001: A Space Odyssey’ is the ubiquitous reference for all Sci-Fi. It begs the question, why is that?

The interesting thing about OSS projects vs commercial ones is that in OSS the community becomes the currency. The larger the community, the better the feedback and the faster the code quality increases. Conversely, the better the quality, the more people the project attracts. After a certain point the success of a project becomes a runaway effect, at least until somebody screws up (the project gets forked) or the platform the project is built on becomes obsolete.

I see your inspirational troll but I like technical pissing contests as much as the next guy…

First, for all the people who advocate the use of LL*, ANTLR, or equivalent language generators, take a minute to consider the excessive amount of overhead those approaches create. You’re talking about building a complete AST (Abstract Syntax Tree) with a ton of intermediate memoization for what should essentially be a simple top-down parser.

It turns out that Chomsky was a pretty smart guy.

That may ‘work’ on local/browser implementations but on the server-side it won’t scale for shit.

I would argue that Markdown has a simple enough grammar that it should be possible to parse it with a Type 3 parser using a single char regex matching + FSA scheme. We’re talking, no AST and very very little overhead. The only memoization overhead expected is equal to the number of chars that are accumulated between state transitions (ie one string, no complex data structures necessary).

We’re talking a no frills implementation but it should be lightweight enough that further optimizations (ex inlining) will be rendered unnecessary.
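
To make the idea concrete, here is a minimal sketch of a single-pass state machine in Python. It is an illustration under heavy assumptions, not a real Markdown parser: it only knows *emphasis* and `code` spans, does no HTML escaping, and the names State, TRANSITIONS and render_inline are invented for the example.

```python
from enum import Enum, auto


class State(Enum):
    TEXT = auto()
    EMPHASIS = auto()
    CODE = auto()


# (current state, input char) -> (next state, markup to emit).
# Any pair not listed here means "just accumulate the char".
TRANSITIONS = {
    (State.TEXT, "*"): (State.EMPHASIS, "<em>"),
    (State.EMPHASIS, "*"): (State.TEXT, "</em>"),
    (State.TEXT, "`"): (State.CODE, "<code>"),
    (State.CODE, "`"): (State.TEXT, "</code>"),
}


def render_inline(source: str) -> str:
    state = State.TEXT
    out: list[str] = []  # a streaming version would write these out directly
    buf: list[str] = []  # chars accumulated since the last state transition
    for ch in source:
        step = TRANSITIONS.get((state, ch))
        if step is None:
            buf.append(ch)
            continue
        state, markup = step
        out.append("".join(buf))
        out.append(markup)
        buf.clear()
    # Flush whatever is left; an unclosed delimiter leaves its tag open.
    out.append("".join(buf))
    return "".join(out)


if __name__ == "__main__":
    print(render_inline("a *b* and `c*d`"))
```

No tree is built at any point. The working memory is the current state plus the chars seen since the last transition, which is the property argued for above.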

The only exception to this is where code needs to be further processed such as the numbered link (which I really like) style that SO uses and syntax highlighting.

For syntax highlighting, it’s trivial to add an inline parser hook that can be leveraged for additional processing. For the numbered links you can do a mark and replace through a second pass. Which could be further optimized by marking string index positions on the first pass.
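
A two-pass treatment of numbered links could look roughly like the following Python sketch. It assumes references of the form [text][1] and definitions of the form [1]: url on their own line; the patterns and the function name resolve_numbered_links are assumptions for illustration, and URLs are not escaped.

```python
import re

# A definition line such as "  [1]: http://example.com/" (assumed format).
DEFINITION = re.compile(r"^\s*\[(\d+)\]:\s*(\S+)\s*$")
# A reference such as "[link text][1]" (assumed format).
REFERENCE = re.compile(r"\[([^\]]+)\]\[(\d+)\]")


def resolve_numbered_links(source: str) -> str:
    urls: dict[str, str] = {}
    body: list[str] = []

    # First pass: collect the definitions and drop them from the output.
    for line in source.splitlines():
        match = DEFINITION.match(line)
        if match:
            urls[match.group(1)] = match.group(2)
        else:
            body.append(line)

    # Second pass: replace each reference whose number has a definition.
    def replace(match: re.Match) -> str:
        url = urls.get(match.group(2))
        if url is None:
            return match.group(0)  # leave unknown references untouched
        return f'<a href="{url}">{match.group(1)}</a>'

    return "\n".join(REFERENCE.sub(replace, line) for line in body)


if __name__ == "__main__":
    text = "See [the spec][1] and [this][2].\n\n  [1]: http://example.com/spec"
    print(resolve_numbered_links(text))
```

This version rescans the body with a regex on the second pass. Recording the index of each reference during the first pass, as suggested above, would avoid that rescan.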

In lower level languages this could probably be optimized even further using non-null-terminated-strings (ie ones that contain a length prefix) but I’m no prolific C hacker.

If you’d like to see a Type 3 parser in action, feel free to browse the source @ jQuery-CSV. I created it because I wanted to write the first 100% RFC 4180 compliant parser in pure javascript. jQuery isn’t necessarily a dependency, but if I’m going to go through all the effort to hijack a namespace, I might as well go for the biggest one.

It contains two minimal CSV-specific parser implementations, $.csv.parsers.splitLines() and $.csv.parsers.parseEntry() (the names should indicate what they do). Also, the library includes hooks to inline code into the parser stages for further processing (ex auto-casting scalar values).
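
For a feel of what an entry parser of this kind does, here is a small Python sketch of a char-by-char state machine that splits one CSV line into fields. It is not the jQuery-CSV source and makes no claim to match its behavior; the function name parse_entry and the state names are invented, and only the RFC 4180 quoting rules (quoted fields, doubled quotes as an escaped quote) are handled.

```python
def parse_entry(line: str, delimiter: str = ",", quote: str = '"') -> list[str]:
    """Split a single CSV line into fields using a small state machine."""
    fields: list[str] = []
    buf: list[str] = []  # chars of the field currently being read
    state = "start"      # start | unquoted | quoted | quote_seen

    for ch in line:
        if state == "start":
            if ch == quote:
                state = "quoted"
            elif ch == delimiter:
                fields.append("")  # empty field
            else:
                buf.append(ch)
                state = "unquoted"
        elif state == "unquoted":
            if ch == delimiter:
                fields.append("".join(buf))
                buf.clear()
                state = "start"
            else:
                buf.append(ch)
        elif state == "quoted":
            if ch == quote:
                state = "quote_seen"  # closing quote or first half of ""
            else:
                buf.append(ch)
        else:  # quote_seen
            if ch == quote:
                buf.append(quote)  # "" inside quotes is a literal quote
                state = "quoted"
            elif ch == delimiter:
                fields.append("".join(buf))
                buf.clear()
                state = "start"
            else:
                raise ValueError("unexpected character after closing quote")

    if state == "quoted":
        raise ValueError("unterminated quoted field")
    fields.append("".join(buf))  # the last field has no trailing delimiter
    return fields


if __name__ == "__main__":
    print(parse_entry('a,"b ""c"", d",,e'))
```

A hook for further processing (such as casting scalar values) would fit naturally at the two places where a finished field is appended.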

I can’t really take credit for the idea though. The newest parser implementation was inspired by some very good suggestions made by the author of jquery-tsv. I didn’t even know what a Type 3 parser was a month ago. Unlike the formally educated, I have zero formal education in programming; I just have a talent for picking this stuff up along the way.

Will all of the half-assed CSV parsers that can be found on literally thousands of blogs disappear overnight? Of course not. They will still exist, but the power of branding is that a name can propagate much faster than a concept.

I’m not sure if somebody is measuring but I think we have a winner (me). Either that or my ‘confirmation bias’ is being a douche again. lol…

Andy · November 6, 2012, 12:00am

@Liam Thanks for setting up the w3c page, but it seems to present a few barriers to entry by requiring more formal disclosure than most fora:

“* Employment Affiliation. You must indicate any significant employment relationship”

I’m not even sure I’m allowed to put the company name on stuff I do on my own time pursuing my own interests!

Andy · November 6, 2012, 12:00am

Another major plus for Markdown is its support now in Doxygen, since 1.8, which means when I write my pages describing software design I can do so in Markdown, getting instant previewing and export.

It uses the extensions from PHP Markdown Extra, and GitHub flavored Markdown. I’ve found the simple tables and fenced code blocks incredibly useful.

Full details at
http://www.stack.nl/~dimitri/doxygen/markdown.html

Dulk · November 6, 2012, 12:00am

Prove that the ‘Go’ instead of ‘No’ mentality of the interwebs is stronger. Please.
Make it work. Adoption by non-techies like me will flourish.

asbjornu · November 7, 2012, 12:00am

@Andy Dent, you can choose “No Affiliation”, which means you’re only representing yourself.

@Evan Plaice, your arguments against standardizing Markdown could be made against standardizing HTML 20 years ago too. Now that we have a proposed W3C Working Group to handle the standardization of the Markdown language, I believe that’s the best way forward.

Shadow2531 · November 7, 2012, 12:00am

I hate that markdown doesn’t do bold, italics and underlining this way:

*bold*
/italics/
_underline_

where you escape *, / and _ with \ if you don’t want styling.

That’s how it should be as it matches text/plain email composition style. You should have to do 2 ** before and after to get bold.

Shadow2531 · November 7, 2012, 12:00am

“You should have to do 2…”

Shouldn’t I mean.

Evan_Plaice · November 7, 2012, 12:00am

@asbjornu

You’re assuming the W3C drives the direction of the web. They kinda lost that distinction with the whole XHTML 1.0/HTML 4.01 debacle. Last I checked the WHATWG was the driving force behind HTML5 not the W3C.

I would argue that they’re probably one of the worst organizations to handle the Markdown specification. They’re already too top-heavy. When it comes to creating a new spec the group behind it needs to be small, influential, and highly motivated.

asbjornu · November 8, 2012, 12:00am

@Evan Plaice, the stagnation of HTML was caused by the existing WG and its decision to deprecate HTML altogether in favor of XHTML. WHATWG was created to continue the development of HTML since there was no way to start that initiative from within W3C.

Since there is no existing Markdown WG in W3C, I don’t think it faces the same problems as HTML did 8 years ago. While I agree with Anne van Kesteren’s criticism of the W3C Process, I think it mainly boils down to how good the chairs and editors of the WG are. If we go the IETF route, those chairs and editors are even more important (from my experience being involved in RFC 4287 and 5023 as well as HTML5); just look at the mess that is OAuth 2.0.

I’m not sure that creating an ad-hoc standardization effort without the backing of an existing organization is such a good idea either; microformats isn’t in much better shape than HTML was 8 years ago. I’m not saying this is easy, I’m just saying it might be easier if we go with an organization that at least has a track record (although not perfect) than with no organization at all.

Spike666 · November 8, 2012, 12:00am

I 100% agree with everything you’re saying. Markdown is spectacular and the fragmentation out there is doing it an extreme disservice.

I’ve been using Markdown for notes, documentation and even for writing essays. If markdown could be adopted by Google for Sites and by Evernote, I think it would take over the world.

With that being said, a friend is writing a book and showed me AsciiDoc (http://www.methods.co.nz/asciidoc/) which is like a way-complicated markdown, but adds a lot of nice features for basic formatting that I think Markdown is missing for longer documents. If I were to write a book, I’d probably draft it in Markdown, but then spend a week wrangling it into some other format (asciidoc? TeX?) so that it pages and footnotes properly.

A quick search in the comments shows someone else mentioned asciidoc, too. They beat me to it.

Shaneknysh · November 13, 2012, 12:00am

I’m rather surprised that reStructuredText is getting so much airplay while multimarkdown has but a single mention.

ChristopheM · November 14, 2012, 12:00am

I just subscribed to the W3C mailing-list, as joining the group was a bit too complicated when your organization is not yet part of W3C (do I own any rights etc. hell if I know…).

Anyway, I very much second the initiative and hope this will go somewhere!

If this is more of a people issue than a technical one, why not first have the major parties at least subscribe to a common communication platform? (I also went to visit github/markdown but found no way to subscribe there…).

asbjornu · November 16, 2012, 12:00am

Christophe, you can join the W3C WG as an individual (as I did).