I had been trying to extract the domain from a URL that I receive in a Yahoo Pipes pipe. With the following regular expression rules, we can extract the domain name from a URL in its general form.
Drag the ‘Regex’ module from ‘Operator’, drop it onto the workspace, and add the following rules to it:
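^(http|https)://([\w-]+\.)+[\w-]+(/[\w- \+#@!~./?%&=]*)?$ -> extract $2
Because the second group repeats, $2 holds only its last match: the host label just before the final one, with its dot (for www.example.com that is "example.").
Then add this rule to remove the trailing dot (.):
(.*)\.$ -> extract $1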
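To check the two rules outside of Pipes, here is a small Python sketch that applies them in the same order. It is only an approximation of what the Regex module does: the function name `extract_domain` and the sample URLs are made up for illustration, and the hyphen in the path character class is moved to the end because Python's `re` rejects a range that starts with `\w`.

```python
import re

# Rule 1: match a URL in its general form. Group 2 repeats, so after a
# match it holds only the last dot-terminated host label, e.g. "example.".
# The hyphen is placed last in the path class (an adjustment for Python).
URL_RULE = re.compile(
    r"^(http|https)://([\w-]+\.)+[\w-]+(/[\w \+#@!~./?%&=-]*)?$"
)

# Rule 2: strip a single trailing dot.
TRAILING_DOT_RULE = re.compile(r"(.*)\.$")


def extract_domain(url: str) -> str:
    """Apply rule 1, then rule 2 (hypothetical helper for this post)."""
    step1 = URL_RULE.sub(r"\2", url)          # whole URL -> "example."
    return TRAILING_DOT_RULE.sub(r"\1", step1)  # "example." -> "example"


if __name__ == "__main__":
    for sample in ("http://www.example.com/feed?x=1",
                   "https://blog.example.org"):
        print(sample, "->", extract_domain(sample))
```

For both sample URLs this prints `example`. A string that does not match the first rule (an `ftp://` address, say) passes through unchanged, so it may be worth checking the input first.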
Regular expression to extract domain from the url
Posted by Manish on April 10, 2009