Regular Expressions: Now You Have Two Problems

I HATE regular expressions.

HATE HATE HATE.

It drove me nuts when I ran across them and couldn’t figure them out, so I spent about a year learning to use them well. I wrote some moderately complex ones and some simple ones, and then I just stopped using them.

My problem wasn’t so much understanding what a regex did as knowing whether it was correct.
It is very easy to write a regex that looks like it should work but misses a few cases.
Just go to regexlib.com and search for currency: you’ll find 30+ different ways to parse or format US currency.

How easily can you tell the difference between these two?
^\d*.\d{2}$
^\d+(?:.\d{0,2})?$
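One way to see the difference is to run both patterns against a handful of inputs. The sketch below uses Python's re module with the two patterns copied exactly as printed above; the sample strings and the compare helper are my own illustration, not part of the original post.

```python
import re

# Patterns copied exactly as printed above (note the unescaped dot).
PATTERN_A = re.compile(r"^\d*.\d{2}$")
PATTERN_B = re.compile(r"^\d+(?:.\d{0,2})?$")

# Hypothetical sample inputs chosen to expose the differences.
SAMPLES = ["1.50", "12", "1.5", ".50", "1x50"]


def compare(samples):
    """Return (sample, matches_a, matches_b) for each sample string."""
    return [
        (s, bool(PATTERN_A.match(s)), bool(PATTERN_B.match(s)))
        for s in samples
    ]


for sample, a, b in compare(SAMPLES):
    print(f"{sample!r:8} A={a!s:5} B={b!s:5}")
```

The two patterns disagree on "12", "1.5", and ".50", and both accept "1x50", because an unescaped dot matches any character. None of that is obvious from reading the patterns side by side.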

What about these two?
^$( )\d(.\d{1,2})?$
([^,0-9]\D*)([0-9]|\d,\d*)$

Or God forbid these two?
^$?-?([1-9]{1}[0-9]{0,2}(,\d{3})(.\d{0,2})?|[1-9]{1}\d{0,}(.\d{0,2})?|0(.\d{0,2})?|(.\d{1,2}))$|^-?$?([1-9]{1}\d{0,2}(,\d{3})(.\d{0,2})?|[1-9]{1}\d{0,}(.\d{0,2})?|0(.\d{0,2})?|(.\d{1,2}))$|^($?([1-9]{1}\d{0,2}(,\d{3})*(.\d{0,2})?|[1-9]{1}\d{0,}(.\d{0,2})?|0(.\d{0,2})?|(.\d{1,2})))$

^$([0]|([1-9]\d{1,2})|([1-9]\d{0,1},\d{3,3})|([1-9]\d{2,2},\d{3,3})|([1-9],\d{3,3},\d{3,3}))([.]\d{1,2})?$|^($([0]|([1-9]\d{1,2})|([1-9]\d{0,1},\d{3,3})|([1-9]\d{2,2},\d{3,3})|([1-9],\d{3,3},\d{3,3}))([.]\d{1,2})?)$|^($)?(-)?([0]|([1-9]\d{0,6}))([.]\d{1,2})?$

I’d much rather bank on writing a ParseCurrency function to parse or format the data using standard string manipulation.
That’s way easier to look at in 3 months or 3 years.
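As a rough idea of what such a function could look like, here is a minimal sketch in Python using only plain string operations. The accepted format is an assumption on my part: an optional leading minus sign, an optional dollar sign, whole dollars either ungrouped or grouped by commas in threes, and an optional decimal point followed by one or two digits. The name parse_currency and the helper _is_digits are hypothetical.

```python
from decimal import Decimal


def _is_digits(s: str) -> bool:
    """True if s is a non-empty run of ASCII digits."""
    return s.isascii() and s.isdigit()


def parse_currency(text: str) -> Decimal:
    """Parse a US currency string such as "$1,234.56" into a Decimal.

    Assumed rules (not from the original post): optional leading "-",
    optional "$", comma-grouped or ungrouped dollars, optional cents.
    """
    s = text.strip()

    # Optional sign, then optional dollar sign.
    negative = s.startswith("-")
    if negative:
        s = s[1:]
    if s.startswith("$"):
        s = s[1:]

    # Split dollars from cents at the first decimal point, if any.
    dollars, point, cents = s.partition(".")
    if point and not (1 <= len(cents) <= 2 and _is_digits(cents)):
        raise ValueError(f"bad cents in {text!r}")

    # With commas, the first group has 1-3 digits and the rest exactly 3.
    groups = dollars.split(",")
    if not all(_is_digits(g) for g in groups):
        raise ValueError(f"bad dollars in {text!r}")
    if len(groups) > 1:
        if len(groups[0]) > 3 or any(len(g) != 3 for g in groups[1:]):
            raise ValueError(f"bad comma grouping in {text!r}")

    value = Decimal("".join(groups) + "." + (cents or "0"))
    return -value if negative else value
```

It is longer than any of the one-liners above, but each rule sits on its own line with its own error message, so a rule can be changed or stepped through in a debugger without disturbing the others.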

There is nothing that can be done with a regex that can’t be done with a function call. The function call may be 10 more lines than a single regex, but will always be 100 times easier to read and debug.

I feel that it goes along with Code Complete’s idea of self-documenting code. If your code or regex can’t be understood without several lines of comments or a separate tool to parse it, then there must be a better way.