Title |
Pattern Title
|
Expression |
\b((?#optional protocol)(https?|ftp|file)://)?
(?#sub domain)([a-z0-9](?:[-a-z0-9]*[a-z0-9])?\.)+
(?#top domain)(com\b|edu\b|biz\b|gov\b|in(?:t|fo)\b|mil\b|net\b|org\b|[a-z][a-z]\b)
(?#optional port)(:\d+)?
(?#optional path)(/[-a-z0-9_:\@&?=+,.!/~*'%\$]*)*
(?#not ending in)(?<![.,?!])
(?#not enclosed in)(?!((?!(?:<a )).)*?(?:</a>))
(?#or enclosed in)(?!((?!(?:<!--)).)*?(?:-->)) |
Description |
Yet Another URL Search. Useful for capturing URLs in raw text. Ignores URLs in HREF and comments. Turn off whitespacing to test! |
Matches |
http://www.google.com | google.com | http://some-domain.net/very/long/path/123.html |
Non-Matches |
subdomain.NonExistentTopDomain | <a href="http://www.google.com">www.google.com</ |
Author |
Simon Ferguson
|
Source |
Smorgasbord: some from Friedl, some from here, some from http://www.regular-expressions.info/ |
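A minimal sketch of how the expression could be tried out, assuming Python's `re` module rather than the engine the pattern was written for. The name `find_urls` and the sample strings beyond the listed Matches and Non-Matches are illustrative assumptions, and the anchor sample is written here with a complete closing tag. The pattern lines are joined without line breaks, which keeps the literal space in `<a ` intact.

```python
import re

# The expression from the listing, concatenated with no whitespace between
# the pieces. The (?#...) groups are inline comments; the engine skips them.
URL_PATTERN = re.compile(
    r"\b((?#optional protocol)(https?|ftp|file)://)?"
    r"(?#sub domain)([a-z0-9](?:[-a-z0-9]*[a-z0-9])?\.)+"
    r"(?#top domain)(com\b|edu\b|biz\b|gov\b|in(?:t|fo)\b|mil\b|net\b|org\b|[a-z][a-z]\b)"
    r"(?#optional port)(:\d+)?"
    r"(?#optional path)(/[-a-z0-9_:\@&?=+,.!/~*'%\$]*)*"
    r"(?#not ending in)(?<![.,?!])"
    r"(?#not enclosed in)(?!((?!(?:<a )).)*?(?:</a>))"
    r"(?#or enclosed in)(?!((?!(?:<!--)).)*?(?:-->))",
    re.IGNORECASE,  # assumption: the pattern is meant to be case-insensitive
)


def find_urls(text):
    """Return every substring of text that the pattern accepts (hypothetical helper)."""
    return [m.group(0) for m in URL_PATTERN.finditer(text)]


if __name__ == "__main__":
    samples = [
        "http://www.google.com",
        "google.com",
        "http://some-domain.net/very/long/path/123.html",
        "subdomain.NonExistentTopDomain",
        '<a href="http://www.google.com">www.google.com</a>',
        "<!-- http://www.google.com -->",
        "Visit http://www.google.com/path.",
    ]
    for sample in samples:
        print(repr(sample), "->", find_urls(sample))
```

In this sketch, group 2 holds the protocol, group 4 the top domain, group 5 the port and group 6 the last path segment matched. Because the path class itself contains `/`, the nested `(...)*` can backtrack heavily on long paths that end up rejected by the lookaheads, so it may be worth testing against realistic input sizes.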
Title: I got it
Name: Markus Engelhardt
Date: 8/6/2007 3:49:50 AM
Comment:
ok, I got it working, don't know if this is a typo in the
RegEx, but it only started working for me when I changed
the part
(?#not ending in)(?<![.,?!])
to
(?#not ending in)(@<![.,?!])
Then the error stopped, now it's fantastic!
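For anyone comparing the two spellings, here is a small sketch of how they are read, assuming Python's `re` module; other engines, including .NET, may treat them differently. The names `lookbehind` and `literal_group` are illustrative only. In this engine `(?<!...)` is a zero-width negative lookbehind, while `(@<!...)` is an ordinary capturing group that expects the literal characters `@<!` in the input.

```python
import re

# (?<![.,?!]) consumes nothing: it only asserts that the character just
# before the current position is not one of . , ? !
lookbehind = re.compile(r"[a-z.]+(?<![.,?!])")

# (@<![.,?!]) is a plain capturing group here: it matches the literal
# text "@<!" followed by exactly one character from the set.
literal_group = re.compile(r"(@<![.,?!])")

if __name__ == "__main__":
    # The lookbehind makes the greedy run give back the trailing full stop.
    print(lookbehind.search("google.com.").group(0))
    # The literal group finds nothing in ordinary URL text ...
    print(literal_group.search("google.com."))
    # ... and only matches when "@<!" actually appears in the input.
    print(literal_group.search("a@<!?").group(0))
```

Under this reading, swapping one for the other removes the "not ending in punctuation" check instead of preserving it, so the result is worth verifying against the listed Matches in the target engine.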
Title: Error in C#
Name: Markus Engelhardt
Date: 8/3/2007 7:48:39 AM
Comment:
Hi there.
After hours of searching I found exactly what I needed with this sample... at least I thought so... but unfortunately, I cannot get it to work in C#: it keeps telling me "unknown group element". Can anyone help me get the same thing working in C#? Thanks a lot in advance, best regards, Markus