If You Like Regular Expressions So Much, Why Don't You Marry Them?

All right... I will!

I'm continually amazed how useful regular expressions are in my daily coding. I'm still working on the MhtBuilder refactoring, and I needed a function to convert all URLs in a page of HTML from relative to absolute:
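As a rough illustration of the idea, and not the author's original listing, here is a minimal Python sketch of a regex-based relative-to-absolute conversion. The group names (attrib, delim1, url, delim2) mirror the ones quoted later in this thread; the pattern itself, the function name make_absolute, and the url_root parameter are assumptions made for the example.

```python
import re

# Hypothetical pattern: only the group names come from the thread.
RELATIVE_URL = re.compile(
    r"""(?P<attrib>\b(?:href|src))      # attribute that holds a URL
        \s*=\s*
        (?P<delim1>["'])                # opening quote
        (?!\w+:|//|\#)                  # skip absolute, protocol-relative, fragment-only
        /?(?P<url>[^"']*)               # the relative path, minus any leading slash
        (?P<delim2>["'])                # closing quote
    """,
    re.IGNORECASE | re.VERBOSE,
)


def make_absolute(html: str, url_root: str) -> str:
    """Prefix every relative href/src value with url_root."""
    root = url_root.rstrip("/")
    return RELATIVE_URL.sub(
        lambda m: f"{m['attrib']}={m['delim1']}{root}/{m['url']}{m['delim2']}",
        html,
    )
```

This resolves every relative URL against a single root, which is only right when the page lives at that root. For pages in subfolders, Python's urllib.parse.urljoin does proper resolution.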


This is a companion discussion topic for the original blog entry at: http://www.codinghorror.com/blog/2005/03/if-you-like-regular-expressions-so-much-why-dont-you-marry-them.html

We use TOAD (only for ORACLE dbs), which has an entire module dedicated to this…

http://www.quest.com/toad/

[… but there are potential performance issues in recompiling the regex on each postback …]

They could just embed the precompiled regex into their assembly, but at the framework level of the coding pyramid it probably makes more sense to hard-code it.

[… did you know that the ASP.NET page parser uses regular expressions? …]

And another trivium is that the ValidateRequest logic that looks for HTML and other “potentially dangerous” markup in the postback does NOT use regexes. It could – it’s a natural application – but there are potential performance issues in recompiling the regex on each postback.
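The recompilation concern can be sketched in Python as a loose analogy to the .NET situation: build the pattern once when the module loads and reuse it on every request. The pattern below is a hypothetical stand-in for a "potentially dangerous markup" check and is not what ASP.NET actually runs; the names SUSPICIOUS_MARKUP and looks_dangerous are made up for the example.

```python
import re

# Compiled once at import time, then shared by every call.
# Hypothetical pattern: "<" followed by a letter, "!", "/" or "?", or the sequence "&#".
SUSPICIOUS_MARKUP = re.compile(r"<[a-z!/?]|&#", re.IGNORECASE)


def looks_dangerous(value: str) -> bool:
    """Return True if the posted value contains markup-like text."""
    return SUSPICIOUS_MARKUP.search(value) is not None
```

Python's re module also keeps its own cache of recently used patterns, so the saving there is smaller than it would be in an engine that rebuilds the pattern on every use.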

I have a few details about that here, including the hypothetical regex that they would use:

http://mikepope.com/blog/DisplayBlog.aspx?permalink=441

I have been looking for a tool to beautify SQL for quite some time. Where could I find one?

Anyone got a regex that converts an absolute path into a relative path? I need to convert “http://domain.com/folder/images/myimage.gif” to just “images/myimage.gif”.

Erica,

Here are a few… starting with:

http://domain.com/folder/images/myimage.gif

Return the last filename in the url:

“[^/]+[^/]$” – myimage.gif

Return the last folder in the url:

“[^/]+/(?=$|[^/]+$)” – images/

Return the webroot:

“^\w+://[^/]+(/)*” – http://domain.com/

Return the webroot plus the first subfolder:

“^\w+://([^/]+/){2}” – http://domain.com/folder/
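The four patterns above can be checked quickly in Python. This is just a small test harness; the labels and the results dictionary are names chosen for the example.

```python
import re

url = "http://domain.com/folder/images/myimage.gif"

# The patterns exactly as posted, keyed by what each is meant to return.
patterns = {
    "last filename": r"[^/]+[^/]$",
    "last folder": r"[^/]+/(?=$|[^/]+$)",
    "webroot": r"^\w+://[^/]+(/)*",
    "webroot plus first subfolder": r"^\w+://([^/]+/){2}",
}

# group(0) is the whole matched text, which is what the post lists.
results = {label: re.search(p, url).group(0) for label, p in patterns.items()}

for label, matched in results.items():
    print(f"{label}: {matched}")
```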

The problem you run into is that relative is… uh… relative to what?
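Once a base has been chosen, stripping it is straightforward. Here is a minimal Python sketch, assuming the base is known in advance; strip_base is a hypothetical helper name.

```python
import re


def strip_base(url: str, base: str) -> str:
    """Remove base from the start of url; return url unchanged if it does not start with base."""
    prefix = base.rstrip("/") + "/"
    # re.escape keeps the dots and slashes in the base from acting as regex syntax.
    return re.sub("^" + re.escape(prefix), "", url, count=1)
```

With the base http://domain.com/folder/ the example URL becomes images/myimage.gif, and with the base http://domain.com/ it becomes folder/images/myimage.gif, which is exactly the "relative to what?" question.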

Sorry, I forgot to add this information. All the links are of the form:
"a href=“disable_javascript:mapWindow=x_window.open(’/something.htm’)”

I need to convert these relative URLs to absolute ones. I know that I have to use the Replace() method. How do I go about it?
I went through the code that is posted above and didn’t understand this line:
“html = r.Replace(html, “${attrib}=${delim1}” & _HtmlFile.UrlRoot & “/${url}${delim2}”)”

What does _HtmlFile.UrlRoot mean??
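For links of that form, one possible approach is to target the quoted path inside window.open(...) directly. The Python sketch below is an illustration under assumptions: url_root is a hypothetical stand-in for whatever _HtmlFile.UrlRoot holds (presumably the root URL of the site the page came from), and the pattern and function name are invented for the example.

```python
import re

# Matches window.open('<relative path>') with either quote style.
# URLs with a scheme (http:, mailto:) or a leading // are left alone.
WINDOW_OPEN = re.compile(
    r"""(?P<call>window\.open\(\s*)(?P<quote>['"])(?!\w+:|//)/?(?P<url>[^'"]*)(?P=quote)"""
)


def absolutize_window_open(html: str, url_root: str) -> str:
    """Prefix relative paths passed to window.open() with url_root."""
    root = url_root.rstrip("/")
    return WINDOW_OPEN.sub(
        lambda m: f"{m['call']}{m['quote']}{root}/{m['url']}{m['quote']}",
        html,
    )
```

The replacement is assembled the same way as the quoted line: the captured pieces are written back unchanged, with the root and a slash inserted in front of the captured path.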

Hi,
I have the same problem as Samir.
Samir, did you or anyone else find the meaning of _HtmlFile.UrlRoot?

Hi guys,
I have a regular expression problem. I have a huge HTML text and I want to convert every <a href=…>text</a> to plain text where the link does not start with http://
In short, I want only external links in my document and want to replace the others with their respective text.

Please please help
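One way to approach this is a replacement callback that keeps a link when its href starts with http:// and otherwise returns only the link text. This is a minimal Python sketch; the pattern and the name unlink_internal are assumptions, and regex-based HTML handling like this will miss unusual markup.

```python
import re

# Captures the href value and the text between <a ...> and </a>.
ANCHOR = re.compile(
    r"""<a\b[^>]*?\bhref\s*=\s*["']?(?P<href>[^"'\s>]*)[^>]*>(?P<text>.*?)</a>""",
    re.IGNORECASE | re.DOTALL,
)


def unlink_internal(html: str) -> str:
    """Replace links whose href does not start with http:// by their text."""
    def replace(m: re.Match) -> str:
        if m["href"].lower().startswith("http://"):
            return m.group(0)   # external link: keep the whole tag
        return m["text"]        # anything else: keep only the text
    return ANCHOR.sub(replace, html)
```

The test follows the question literally, so https:// links would also be unlinked; widen the startswith check if those should count as external.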

http://www.salsasetc.com/graphics/H-175A%20large.jpg

In reference to:
http://www.codinghorror.com/blog/archives/001016.html

It’s only about three years too late, but KMF, here’s an awesome SQL formatter:

http://www.sqlinform.com/

maybe others will find it useful even though this post is quite old.

It looks like that link might be here now: .NET Regular Expressions: how to use RegexOptions.IgnorePatternWhitespace [Ryan Byington] | Microsoft Learn
