The Visual Studio IDE and Regular Expressions

The Visual Studio IDE supports searching and replacing with regular expressions, right? Sure it does. It's right there in grey and black in the find and replace dialog. Just tick the "use Regular expressions" checkbox and we're off to the races.


This is a companion discussion topic for the original blog entry at: http://www.codinghorror.com/blog/2006/07/the-visual-studio-ide-and-regular-expressions.html

I use regular expressions heavily in Visual Studio. Perhaps I’m missing the boat on some other interactive regular expression searcher, but I don’t know what I’d do without them. I work on an older project with at least a couple hundred thousand lines of code and I need to be able to understand and change any of it. Searching is essential for understanding the nature of such systems that are too large to keep entirely in your head.

Also, iteratively doing regular expression search and replace is really handy when translating code from one language to another or generating wrapper/adapter code for largish APIs.

The reason you have the weird search results is that the search starts from the cursor, and your regular expression matches since the first character after the cursor is a letter, etc. Use the beginning of word character or prefix your search with a non-identifier character: [^a-zA-Z][a-zA-Z]+
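A rough sketch of that effect, using Python's `re` module rather than the Visual Studio engine (so the syntax here is Python's, not what the Find dialog accepts). The sample text and the cursor position are made up for illustration.

```python
import re

text = "int counter = total;"
cursor = 6  # hypothetical cursor position, in the middle of "counter"

# Searching from the cursor: the pattern matches the tail of the current word.
tail = re.compile(r"[a-zA-Z]+").search(text, cursor)
print(tail.group())  # unter

# Requiring a non-letter first forces the match to begin at the start of a word.
# The leading non-letter is part of the match, so the word is captured in a group.
whole = re.compile(r"[^a-zA-Z]([a-zA-Z]+)").search(text, cursor)
print(whole.group(1))  # total
```

The prefixed version skips the partial word under the cursor and lands on the next whole word, at the cost of consuming one extra character in front of it.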

I’m always surprised when I come across a regex syntax that doesn’t include \ and \ for matching the beginning and end of words. The vast majority of my regex writing has been for grep, sed and vi, all of which understand those constructs. It sure would be nice if tool writers could just pick one syntax and stick with it, though.

Er, that should be backslash-lt and backslash-gt for begin and end. Caught by the html eating software, I guess.

/searching with regular expressions is such an extreme edge condition for most developers/

What?!? I guess if the tools don’t support it well then that might be the case.

But Vim’s regex search and search/replace, especially combined with really easy macro recording/playback, is so powerful and easy to use (once you’ve got an idea of the syntax), I find myself using it fairly often. When I’m forced to use VS for work, there are times when the lack of a good regex search really jumps out at me.

You must not be familiar with grep to think that < and > are funny. However, in grep, they are preceded by \

So that’s why I keep getting weird results with the regex find. Geez, I thought I was going crazy.

You might say that searching with regular expressions is such an extreme edge condition for most developers

You guys have to realize that if you’re reading this blog (and by this blog, I mean any blog), you’re already way outside the pool of “most developers”. And in a good way, but remember, you’re not always representative of the average developer ;)

Reading this post and the comments reminds me of the fact that I wish Visual Studio (and specifically the C# and VB.NET code editors, which I understand are not really part of Visual Studio proper, but rather more like plugins) were more like a good text editor; instead, I find it sometimes gets in my way because it’s trying to be a code editor when I just want it to be a text editor. It would be cool if there was a way to switch out of “smart code editor” mode into “get out of my way and just be a text editor” mode.

Because VS is quite imperfect as a general purpose text editor, I find myself using my favorite text editor, UltraEdit, in combination with Visual Studio quite often. One of my favorite UltraEdit features is the regular expression search-and-replace feature (which also works across multiple files), though I couldn’t say what “standard” it’s using for regex syntax; like VS, it’s a bit of a hybrid of “standard” perl-type conventions and proprietary UltraEdit conventions, like ^p for an end-of-line character and ^t for tab. UltraEdit has over the years added lots of features like syntax highlighting, code completion, etc. but they’ve never in my opinion added any of these features in a way that interferes with the program’s ability to just be a good “dumb” text editor.

One example of Visual Studio getting in the way by trying to be too damn “smart” is when I want to paste in some text that I’ve copied from a web browser. It insists on pasting it in as HTML! It never fails that I forget about this annoyance, so I have to hit Ctrl-Z to undo, then switch over to UltraEdit, paste my text there, copy it again as plain text, and then I can paste it into the VS editor.

Thanks for the good post, per usual, Jeff,
Dan

It’s always seemed dumb to me that the FR dialog didn’t have an option to use the dotnet RegEx classes!

It looks like the IDE can handle extra ones, judging by the drop down, so why didn’t MS provide one based on dnet, as most dnet apps are written in the IDE?

In the end I had to use Expresso (I tried Regulator but it wouldn’t even run on my machine, something to do with having DN2 installed).

It could be worse. Imagine if they’d chosen POSIX regexes, or old VB’s Like operator (shudder).

Beginning of word: \W\w
Vice versa for end.

Here I’m going out on a limb and trusting the internet that .net syntax is at least pcre-compatible.

But Vim’s regex search and search/replace, especially combined with really easy macro recording/playback, is so powerful and easy to use (once you’ve got an idea of the syntax), I find myself using it fairly often. When I’m forced to use VS for work, there are times when the lack of a good regex search really jumps out at me.

Vim is free. Why not use it at work when you need it?

This is the reason I always keep GNU Emacs handy on any machine I do coding on. I always run into cases where the power of a really good editor is needed. Vim, Emacs and the like have decades of experience behind them, by people who will get it right if it isn’t already.

It appears that VS.NET’s FR was built by people who have time to read blogs, but not to use the very code library that they are asking everyone else to use.

Not sure I’d agree with “…searching with regular expressions is such an extreme edge condition…” either

Indeed. One thing that’s true though is that searching through my CODE with regular expressions is an edge condition (for me). However, I’m always looking at other documents, output from a server, etc. that I regularly need to either format really quickly using regex or just grep through. So yeah, code searches are rare for me, but regular expression searches in general are an everyday occurrence.

And yes, I’ve been bitten by VS’s funky regex language as well.

It’s pretty common for \b to indicate “word boundary” when outside of a character class/set, so you can get something at the beginning or end of a word by putting a \b in the right spot.

you can get something at the beginning or end of a word by putting a \b in the right spot

Right, \b is very handy, but not quite the same thing as the explicit “beginning of word” or “end of word” characters. I did a search on these characters and I got hits on egrep and emacs. So I guess that does exist in some flavors of regex.
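To make the distinction concrete, here is a small sketch in Python. Python's `re` module does not provide the backslash-lt and backslash-gt anchors, so the `START` and `END` patterns below are my own lookaround-based stand-ins (hypothetical names, not part of any library), and nothing here reflects the Visual Studio syntax.

```python
import re

text = "ab cd"

# \b matches at both edges of every word.
boundaries = [m.start() for m in re.finditer(r"\b", text)]
print(boundaries)  # [0, 2, 3, 5]

# Stand-ins for explicit "beginning of word" and "end of word" anchors:
START = r"\b(?=\w)"   # a boundary with a word character after it
END = r"\b(?<=\w)"    # a boundary with a word character before it

starts = [m.start() for m in re.finditer(START, text)]
ends = [m.start() for m in re.finditer(END, text)]
print(starts)  # [0, 3]
print(ends)    # [2, 5]

# Typical use: words that begin with "cat" versus words that end with "cat".
words = "cat concat catalog"
print(re.findall(r"\bcat\w*", words))  # ['cat', 'catalog']
print(re.findall(r"\w*cat\b", words))  # ['cat', 'concat']
```

In most patterns the surrounding characters already decide which edge `\b` lands on, which is why it usually does the job; the explicit anchors only matter when the pattern around them is ambiguous.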

I’m not sure why you are griping here. First, there really isn’t a “standard” regex syntax. Just a whole bunch of bastardized flavors, and arbitrarily picking the javascript flavor.

To me, the additions that Visual Studio makes (with C++ keywords, quoted strings, etc.) are very useful for searching through code. I’d much rather use ‘:i’ for matching an identifier rather than ‘([a-zA-Z_$][a-zA-Z0-9_$]*)’
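For a sense of what the long-hand pattern does, here is a minimal Python sketch that applies it to a made-up line of code. It uses Python's `re` module, not the Visual Studio engine, and makes no claim that ‘:i’ matches exactly the same set of strings; the sample line is invented for illustration.

```python
import re

# The long-hand identifier pattern quoted above: a letter, underscore or $,
# followed by any number of letters, digits, underscores or $.
IDENTIFIER = re.compile(r"[a-zA-Z_$][a-zA-Z0-9_$]*")

line = "total_count = $price * qty2 + 10;"  # hypothetical sample line
identifiers = IDENTIFIER.findall(line)
print(identifiers)  # ['total_count', '$price', 'qty2']
```

The numeric literal is skipped because a digit cannot start a match, which is the part that is easy to get wrong when typing the pattern out by hand every time.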

The use of braces {} to tag patterns rather than parentheses () is pretty annoying, I’ll admit.

In the meantime, I highly recommend picking up some freeware application like this one: http://weitz.de/regex-coach/
It’s geared toward Perl regular expressions, but still very useful for debugging complex patterns

is this just another example of MS not sticking to standards?

First, there really isn’t a “standard” regex syntax. Just a whole bunch of bastardized flavors, and arbitrarily picking the javascript flavor

There’s no “standard” C++, or “standard” English, either. So we should just give up and stop trying? I say froozbah* to you!

‘:i’ for matching an identifier

I have no problem with the shortcut additions. It’s the wholesale abandonment of normal regex behavior and conventions that I have a problem with.

* This is a new word I created. Just because.

is this just another example of MS not sticking to standards?

Not really; it’s a case of them hewing too closely to their old, crazy standard from Visual C++ 2.0. Backwards compatibility kills, particularly when it’s backwards compatibility with… er… nonsensical, obsolete stuff.

And, I suspect, a very low priority for this feature compared to other more mainstream improvements in the IDE.

Still, you could wish someone had been a bit braver about scrapping the old to make way for the new… it pains me to hear that developers at Microsoft spent time bugfixing the old, weirdo VC++ regex syntax.

Hey Jeff (and assorted follow-up posters),

I’m the lead program manager for the team that owns editing and the find/replace dialog in Visual Studio. Our team agrees with your post :)

It is a very oddball regex syntax, and as best we can tell it comes from Visual C++ 2.0. We did want to add additional support for .NET 2.0-style regular expressions in the Visual Studio 2005 release, but unfortunately due to time pressures it didn’t make the final list of features. We were able to make a number of bug fixes to the existing engine though, to give some improvement over VS 2003.

We do keep this on our list of things we want to fix. Ideally at some point we’ll actually build in a nifty little extensibility point so you can wire up any regex engine you want for searches.

Thanks for the feedback!

Neil Enns
Lead Program Manager
Microsoft Visual Studio