@Dennis Forbes:
By the way, there are two Juans in the room: I’m Juan Zamudio, the other one is just Juan.
While I agree that most of the recent posts are not that valuable, I keep coming back hoping I can read another great post like in the good old days. But I find it interesting that you come back for more, and that after almost two years you have not found the time to remove codinghorror from iGoogle, given that you dislike this blog. That’s the point I want to make.
I also didn’t see the point of that Google-fu you mention; it brings nothing to the table.
PS: I’m not a Jeff groupie; I found this blog by accident too (searching for something related to the Code Complete book; I’m a McConnell whore, I have to admit).
Disagree with Jeff on his own blog! GENIUS! Then you can gain the attention of a bunch of people who through survivorship-bias (in that they continued to read it) are going to likely be fans of Jeff’s!
Somehow I don’t think that strategy is a very good avenue to fame. Gosh, I’m going to have to rethink this.
Simple, just go against everything that a well-known person says even if they are completely right.
I disagree with Jeff when I disagree with Jeff (somehow CodingHorror got on my iGoogle page, and I’ve been remiss to remove it. And every now and then I expand one of those nodes…). If this hurts your precious feelings, I would advise that you stop reading the comments.
Guys let’s take this advice and ignore this guy.
This is like those YouTube channels where people put a big notice at the top disclaiming that they don’t care what anyone thinks, which of course means that they desperately care what everyone thinks.
Honestly I think Jeff should disable comments, because his biggest fans are his worst enemies, and they are the reason he often gets undeserved backlash. It’s like some sort of weird little groupie festival.
*text* or _text_ (or double) are not good choices for markup in an environment that is full of bad C code. Go for of some sort and sanitize the database by escaping the old posts of course. These * and _ will just annoy everyone.
While I see the point Dennis is making, I often go with Jeff’s approach for testing.
I go with the ‘Dennis’ (smart) approach always (mind the markup). THEN I always go and bruteforce test in as many ways possible. I’m always surprised at at least one edge case I missed. I try to make it a habit to think why I missed the particular case initially. That way my reasoning + hitrate improves.
First, the folks debating how to make sure the ‘*’ block is surrounded by either whitespace or the start/end of lines … doesn’t the regexp library being used support ‘word boundary’ matches? “\b” is usually it (http://www.regular-expressions.info/wordboundaries.html).
Second, I echo the concerns about trying to handle this as a regular expression problem, when it’s quite obviously a language grammar parsing problem, more likely to be satisfactorily solved using a BNF or PEG grammar.
Third, and most importantly, why are you eschewing libraries which are out there to do exactly this? I mean, one of the advantages of using a quasi-standard like markdown is that everyone and their mother has made a parser of some sort for it already. Don’t waste time reinventing the wheel!
You’ll need to use something like ANTLR to generate your C# parser code from that .leg file, but that should be a WHOLE lot easier than even what you’ve already done with regular expressions.
Fourth, I think the use of two different ways to do a very simple thing (‘*’ and ‘_’, and ‘**’ and ‘__’) is Just Plain Wrong. Provide one way to make bold, and one to make italics. Makes it less likely we’ll hit the other case by mistake. IMHO, the ‘*’ is the most used one and least likely to cause problems.
Finally, I agree with other posters that markdown’s choice of ‘*’ for italics and ‘**’ for bold is braindead (sorry, Gruber!). It should have been ‘/’ and ‘*’ instead. But, at this point, markdown is markdown, and you don’t want an exception on your one site.
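On the word-boundary point above, it may be worth checking how "\b" actually treats an asterisk before relying on it. The sketch below uses Python's re module purely for illustration (the code under discussion is C#, and the helper name emphasize is made up here). Because ‘*’ is itself a non-word character, "\b" next to it only matches when a letter, digit or underscore sits on the other side, so the sketch uses lookarounds to require that the markers are not glued to a word. It is a rough approximation under those assumptions, not a Markdown implementation.

```python
import re

# \b matches only between a word character and a non-word character.
# '*' is a non-word character, so r"\b\*" needs a word character BEFORE the '*'.
after_space = re.search(r"\b\*", " *bold")   # no boundary between ' ' and '*'
after_letter = re.search(r"\b\*", "a*b")     # boundary between 'a' and '*'

# Hypothetical alternative: the opening '*' must not follow a word character,
# the closing '*' must not precede one, and the content must start and end
# with a non-space character.
EMPHASIS = re.compile(r"(?<!\w)\*(\S(?:[^*]*\S)?)\*(?!\w)")

def emphasize(text: str) -> str:
    """Wrap *text* spans in <em> tags, leaving other asterisks alone."""
    return EMPHASIS.sub(r"<em>\1</em>", text)

if __name__ == "__main__":
    print(after_space, after_letter)
    print(emphasize("this is *important* stuff"))
    print(emphasize("int *p = a*b;"))
    print(emphasize("2 * 3 * 4"))
```

With this pattern the C-style line and the arithmetic line come back unchanged, while the first line gets its emphasis. Nested or mixed markers are out of scope, which is part of why a real parser or an existing library is the safer route.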
I’ve never quite understood why simple HTML markup is considered “inhumane”. What, really, is the difference between these:
*italic* _italic_
<i>italic</i>
Why come up with some complicated regex filter to convert some contrived markup to HTML, when the original HTML was designed to be simple and human readable to begin with?
In almost all cases where I’ve seen this “markdown” style of formatting, there’s some big filter up front that automatically strips out all possible remnants of HTML as part of some cargo-cult security mechanism. Why not just modify the HTML filter to allow basic bold and italic tags through?
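A filter that lets basic bold and italic tags through could be sketched roughly as follows. This is only an illustration in Python: the tag list, the function name filter_comment and the escape-then-restore approach are assumptions, not something any particular site is known to use. Everything is HTML-escaped first, and then only bare whitelisted tags (no attributes) are turned back into real markup.

```python
import html
import re

# Assumption: only these bare tags, without attributes, are allowed through.
ALLOWED_TAGS = ("b", "i", "em", "strong")

# Matches an escaped opening or closing tag such as &lt;i&gt; or &lt;/i&gt;.
_ESCAPED_TAG = re.compile(
    r"&lt;(/?)(%s)&gt;" % "|".join(ALLOWED_TAGS), re.IGNORECASE
)

def filter_comment(raw: str) -> str:
    """Escape all HTML, then restore only the whitelisted bare tags."""
    escaped = html.escape(raw, quote=True)
    return _ESCAPED_TAG.sub(
        lambda m: "<%s%s>" % (m.group(1), m.group(2).lower()), escaped
    )

if __name__ == "__main__":
    print(filter_comment("<i>italic</i> <script>alert(1)</script>"))
    print(filter_comment('<b onclick="x">hi</b>'))
```

Tags carrying attributes stay escaped, so the second example keeps its opening tag as literal text. Note that the sketch does not check that tags are balanced; that would need a separate pass.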
The basics of Markdown – the parts that Jeff is trying to capture, it seems – do have a certain elegance, paying homage to a less advanced era: when all you had was ASCII, it was generally agreed that you could emphasize certain words, and draw attention to others, with nothing more than appropriately placed characters. For those with such a habit, Markdown semantically draws from what they are used to.
I have seen a lot of sites that allow either Markdown, HTML, or some other bastardizations. The back-end process was always Markdown (where used) -> HTML -> correctness checker, so it is a concise set of code.
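The last stage of that pipeline, the correctness checker, might look something like the sketch below. It uses Python's standard html.parser module; the allowed-tag set, the class name and the error messages are all made up for illustration, and a real site would probably be stricter. It reports disallowed tags, attributes, and tags that are not properly nested or closed.

```python
from html.parser import HTMLParser

# Assumption: the only tags the earlier stages are expected to produce.
ALLOWED = {"p", "em", "strong", "b", "i"}

class BalanceChecker(HTMLParser):
    """Collects problems instead of raising, so all errors can be reported."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.stack = []    # currently open tags, innermost last
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            self.errors.append("disallowed tag <%s>" % tag)
        if attrs:
            self.errors.append("attributes on <%s>" % tag)
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.errors.append("unexpected </%s>" % tag)

def check_html(fragment: str) -> list:
    """Return a list of problems found in the fragment (empty if none)."""
    checker = BalanceChecker()
    checker.feed(fragment)
    checker.close()
    errors = list(checker.errors)
    errors.extend("unclosed <%s>" % tag for tag in checker.stack)
    return errors

if __name__ == "__main__":
    print(check_html("<p>some <em>text</em></p>"))
    print(check_html("<p><em>text</p>"))
```

Keeping the checker as a separate final step means it does not matter whether the HTML came from Markdown conversion or was typed directly; both paths get the same validation.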