Detecting URLs with Regexps


Jeff Atwood talks about the problem of detecting URLs in text. The problem, as Jeff points out, is that lots of interesting characters are legal in URLs, including parens. So writing a regular expression that handles both of these cases is hard (but not impossible), because the parens enclose the URL in the first and belong to it in the second:

My website (http://www.example.com) 
http://en.wikipedia.org/wiki/PC_Tools_(Central_Point_Software)

Jeff's solution is pretty comprehensive. It cuts the Gordian knot of a URL enclosed in parens by removing the parens programmatically, which is a good solution since we're not worried about nesting.
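
To make the idea concrete, here is a minimal Python sketch of that match-then-strip approach. It is an illustration rather than Jeff's exact code: the pattern, the character classes, and the find_urls name are all assumptions chosen for this example. The regexp deliberately lets parens through, and a small post-processing step drops them when they wrap the whole match.

```python
import re

# Illustrative pattern (an assumption, not a canonical URL grammar):
# an optional opening paren, then an http(s) URL whose body may contain
# parens, ending on a character that is unlikely to be trailing punctuation.
URL_RE = re.compile(
    r"\(?\bhttps?://[-A-Za-z0-9+&@#/%?=~_()|!:,.;]*[-A-Za-z0-9+&@#/%=~_()|]"
)

def find_urls(text):
    """Return the URLs found in text, without any enclosing parens."""
    urls = []
    for match in URL_RE.finditer(text):
        url = match.group(0)
        if url.startswith("(") and url.endswith(")"):
            # The parens wrap the URL, so neither one is part of it.
            url = url[1:-1]
        elif url.startswith("("):
            # Only the opening paren was captured; drop it.
            url = url[1:]
        urls.append(url)
    return urls

print(find_urls("My website (http://www.example.com) "))
print(find_urls("http://en.wikipedia.org/wiki/PC_Tools_(Central_Point_Software)"))
```

For the two examples above, the first call yields the URL without its parens and the second keeps the trailing paren that belongs to the Wikipedia link. The sketch only handles the case where the opening paren sits directly before the URL; in text like "(see http://www.example.com)" the closing paren would still be picked up as part of the match.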