Technology Cares

Not Just another weblog


Regular expression to extract the domain from a URL

Posted by Manish on April 10, 2009

I had been trying to extract the domain from the URLs that I receive in a Yahoo Pipes pipe. With the following two regular expression rules, we can extract the domain name from a URL of the general form http(s)://host/path.
Drag the ‘Regex’ module from the ‘Operator’ section, drop it onto the workspace, and
add the following rules to it, in this order:
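^(http|https)://([\w-]+\.)+[\w-]+(/[\w- \+#@!~./?%&=]*)?$     -->  replace with $2
Because the second group repeats, $2 holds only its last repetition: the label just before the final one, together with its dot. For http://www.example.com/ that is "example." rather than the full host name. Then add this rule to remove the trailing dot (.):
(.*)\.$     -->  replace with $1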
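
To check the two rules outside of Pipes, here is a minimal Python sketch that applies them in the same order. It is only an approximation of what the Regex module does, and the names URL_RULE, TRAILING_DOT_RULE, HOST_RULE, extract_domain_label and extract_host are made up for this example. One adaptation is needed: Python's re module rejects the "[\w- ...]" class in the first rule as a bad range, so the hyphen is moved to the end of the class, which matches the same set of characters.

```python
import re

# Rule 1, adapted for Python: hyphen moved to the end of the path class.
URL_RULE = re.compile(r"^(http|https)://([\w-]+\.)+[\w-]+(/[\w +#@!~./?%&=-]*)?$")

# Rule 2: drop a single trailing dot.
TRAILING_DOT_RULE = re.compile(r"(.*)\.$")

# Hypothetical variant (not from the post): capture the whole host name
# by wrapping the repeated labels in one outer group.
HOST_RULE = re.compile(r"^https?://((?:[\w-]+\.)+[\w-]+)(/[\w +#@!~./?%&=-]*)?$")


def extract_domain_label(url: str) -> str:
    """Apply rule 1 and then rule 2, as the two Regex rules would."""
    with_dot = URL_RULE.sub(r"\2", url)          # e.g. "example."
    return TRAILING_DOT_RULE.sub(r"\1", with_dot)


def extract_host(url: str) -> str:
    """Return the full host name; the URL is returned unchanged if it does not match."""
    return HOST_RULE.sub(r"\1", url)


if __name__ == "__main__":
    url = "http://www.example.com/path?x=1"
    print(extract_domain_label(url))  # example
    print(extract_host(url))          # www.example.com
```

Note that for a host such as example.co.uk the two original rules return "co", because only the last repetition of the group is kept. If the full host name is what you need, a single outer group as in the HOST_RULE variant may be a better fit.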

Regex Operator


