Title |
Test
Details
Pattern Title
|
Expression |
\[link="(?<link>((.|\n)*?))"\](?<text>((.|\n)*?))\[\/link\] |
Description |
This can be used with a regex replace call to support a pseudo-code link tag without having to enable HTML. The replacement string (in ASP.NET, use Regex.Replace(SourceString, RegularExpressionPattern, ReplacementString)) is <a href="${link}">${text}</a>. |
Matches |
[link="http://www.yahoo.com"]Yahoo[/link] |
Non-Matches |
[link]http://www.yahoo.com[/link] | [link=http://www.yahoo.com]Yahoo[/link] |
Author |
Rating:
Not yet rated.
Ryan S
|
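As a rough illustration of the [link="..."] entry above, the same replacement can be sketched in Python. Python's re module spells named groups (?P<name>...) and refers to them in the replacement as \g<name>. The function name bbcode_link_to_html is invented for this example, and re.DOTALL is an assumed stand-in for the original (.|\n) alternation.

```python
import re

# Python spelling of the pattern: (?P<name>...) instead of (?<name>...),
# and re.DOTALL instead of the original (.|\n) alternation.
LINK_TAG = re.compile(
    r'\[link="(?P<link>.*?)"\](?P<text>.*?)\[/link\]', re.DOTALL
)


def bbcode_link_to_html(source):
    """Replace every [link="url"]text[/link] tag with an HTML anchor."""
    return LINK_TAG.sub(r'<a href="\g<link>">\g<text></a>', source)


if __name__ == "__main__":
    print(bbcode_link_to_html('[link="http://www.yahoo.com"]Yahoo[/link]'))
```

The captured values are inserted as-is, so untrusted input would presumably need HTML escaping before this substitution is applied.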
Title |
Test
Details
Pattern Title
|
Expression |
(("|')[a-z0-9\/\.\?\=\&]*(\.htm|\.asp|\.php|\.jsp)[a-z0-9\/\.\?\=\&]*("|'))|(href=*?[a-z0-9\/\.\?\=\&"']*) |
Description |
Locates a URL in a web page.
It searches in two ways: first it tries to find an href= and reads to the end of the link. If there is no href=, it looks for a file extension instead (.asp, .htm and so on) and takes the data between the surrounding "xxxxxx" or 'xxxxxx' quotes. |
Matches |
href="produktsida.asp?kategori2=218" | href="NuclearTesting.htm" |
Non-Matches |
U Suck |
Author |
Rating:
Not yet rated.
Henric Rosvall
|
Title |
Test
Details
Pattern Title
|
Expression |
(\[[Ii][Mm][Gg]\])(\S+?)(\[\/[Ii][Mm][Gg]\]) |
Description |
Handy when you want to let your users post images, but in a controlled way. I used it like this (in PHP):
$text = preg_replace("/(\[IMG\])(\S+?)(\[\/IMG\])/is", "<a href=\"\\2\" target=\"_blank\"><IMG SRC=\"\\2\" align=\"center\" height=\"100\" border=\"0\"></a>",$text);
so whenever they use
[img]http://www.foo.com/bleh.jpg[/img]
it will be converted to
<a href="http://www.foo.com/bleh.jpg" target="_blank"><IMG SRC="http://www.foo.com/bleh.jpg" align="center" height="100" border="0"></a>
This gives a picture 100 pixels high that opens in a new window when clicked, which prevents users from posting huge pictures. |
Matches |
[IMG]http://bleh.jpg[/IMG] | [ImG]bleh[/imG] | [img]ftp://login: [email protected][/img] |
Non-Matches |
<img src="bleh.jpg"> |
Author |
Rating:
marnik vander elst
|
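For readers not using PHP, the [img] replacement above can be approximated in Python as follows. This is only a sketch: the names IMG_TAG, REPLACEMENT and img_tags_to_html are made up for the example, and re.IGNORECASE replaces the [Ii][Mm][Gg] character classes.

```python
import re

# Case-insensitive version of the [img]...[/img] pattern; \S+? keeps
# anything containing whitespace out of the tag.
IMG_TAG = re.compile(r'(\[img\])(\S+?)(\[/img\])', re.IGNORECASE)

# Same markup as the PHP replacement string: a 100-pixel-high image
# that links to the original file in a new window.
REPLACEMENT = (
    r'<a href="\2" target="_blank">'
    r'<IMG SRC="\2" align="center" height="100" border="0"></a>'
)


def img_tags_to_html(text):
    """Convert every [img]url[/img] tag in text to a linked thumbnail."""
    return IMG_TAG.sub(REPLACEMENT, text)


if __name__ == "__main__":
    print(img_tags_to_html('[img]http://www.foo.com/bleh.jpg[/img]'))
```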
Title |
Test
Details
Pattern Title
|
Expression |
(mailto\:|(news|(ht|f)tp(s?))\://)(([^[:space:]]+)|([^[:space:]]+)( #([^#]+)#)?) |
Description |
This is a very small regex for use within content management software, so that links in text fields do not have to be written in HTML. The editor of the CMS is instructed to use it like this:
1. put a space in front of and behind the URL
2. start the URL with http://, mailto://, ftp:// ...
3. put optional link text within #linktext# (separated from the URL by a single space)
4. if there is no link text, the URL/email will show up as the link text
5. avoid URLs with spaces in the filename (use %20 URL encoding)
Replace pattern (space in front): <a href="\\1\\3\\4" target="_blank">\\3\\6</a> |
Matches |
http://www.domain.com | http://www.domain.com/index%20page.htm #linktext# | mailto://user@domai |
Non-Matches |
<a href="http://www.domain.com">real html link</a> | http://www.without_space_ |
Author |
Rating:
Not yet rated.
Martin Schwedes
|
Title |
Test
Details
Pattern Title
|
Expression |
^[A-Za-zÀ-ÖØ-öø-ÿ '\-\.]{1,22}$ |
Description |
Should match just about any real name, either first
name or last name -- even Jill St. John.
Can't think of a name that has more than 22 characters.
My home page: http://www.US-Webmasters.com/best-start-page/
|
Matches |
Jill St. John | Jørnç | Mc O'Donald-Öztürk |
Non-Matches |
abc123 | Nobody! | @#$%^& |
Author |
Rating:
Not yet rated.
W. D.
|
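A minimal way to try the name pattern above against its listed examples, assuming Python and a helper name (looks_like_name) invented for this sketch:

```python
import re

# The pattern exactly as listed: letters (ASCII and Latin-1 accented),
# space, apostrophe, hyphen and period, 1 to 22 characters in total.
NAME = re.compile(r"^[A-Za-zÀ-ÖØ-öø-ÿ '\-\.]{1,22}$")


def looks_like_name(value):
    """Return True when value passes the name pattern."""
    return NAME.match(value) is not None


if __name__ == "__main__":
    for candidate in ["Jill St. John", "Mc O'Donald-Öztürk", "abc123"]:
        print(candidate, looks_like_name(candidate))
```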
Title |
Test
Details
Pattern Title
|
Expression |
href=[\"\'](http:\/\/|\.\/|\/)?\w+(\.\w+)*(\/\w+(\.\w+)?)*(\/|\?\w*=\w*(&\w*=\w*)*)?[\"\'] |
Description |
I wrote up this regular expression to fetch the href attribute found in <a> tags as well as a few other HTML tags. |
Matches |
href="www.yahoo.com" | href="http://localhost/blah/" | href="eek" |
Non-Matches |
href="" | href=eek | href="bad example" |
Author |
Rating:
Andrew Lee
|
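The href pattern above can be checked against its own Matches and Non-Matches with a short Python sketch. The function name find_hrefs is an assumption for this example, and the escapes that Python does not need (\" \' \/) have been dropped from the pattern.

```python
import re

# Andrew Lee's pattern with the redundant backslashes removed.
HREF = re.compile(
    r"""href=["'](http://|\./|/)?\w+(\.\w+)*(/\w+(\.\w+)?)*(/|\?\w*=\w*(&\w*=\w*)*)?["']"""
)


def find_hrefs(html):
    """Return every complete href="..." attribute the pattern matches."""
    return [m.group(0) for m in HREF.finditer(html)]


if __name__ == "__main__":
    print(find_hrefs('<a href="http://localhost/blah/">x</a>'))
```

Note that the opening and closing quote are matched independently, so a value that opens with " and closes with ' is also accepted.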
Title |
Test
Details
Pattern Title
|
Expression |
(\s|\n|^)(\w+://[^\s\n]+) |
Description |
Matches free-floating URLs (valid protocol + address) in text. It will not touch URLs wrapped in a tag, so you can auto-link the ones that aren't :) A couple of things to know:
1. If the URL is directly next to a tag this won't work (e.g. <br>http://www.acme.com); the URL must be at the start of the text or be preceded by \s or \n.
2. The pattern matches the preceding \s or \n too, so put it back when you replace: $1 is the \s or \n (or empty at the start of the text) and $2 is the URL itself.
VB usage:
Set re = New RegExp
re.Global = True ' replace every match, not just the first
re.Pattern = "(\s|\n|^)(\w+://[^\s\n]+)"
strResult = re.Replace(strText, "$1<a href='$2' target='_new'>$2</a>") |
Matches |
http://www.acme.com | ftp://ftp.acme.com/hede | gopher://asdfasd.asdfasdf |
Non-Matches |
<a href="http://acme.com">http://www.acme.com</a> | <br>http://www.acme. |
Author |
Rating:
ic onur
|
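The same auto-linking idea from the entry above, sketched in Python for comparison. The name autolink is hypothetical; the pattern and replacement are carried over unchanged apart from Python's \1 and \2 backreference syntax.

```python
import re

# Group 1 is the whitespace (or start of text) before the URL,
# group 2 is the URL itself.
FREE_URL = re.compile(r'(\s|\n|^)(\w+://[^\s\n]+)')


def autolink(text):
    """Wrap free-floating URLs in anchor tags, keeping the leading whitespace."""
    return FREE_URL.sub(r"\1<a href='\2' target='_new'>\2</a>", text)


if __name__ == "__main__":
    print(autolink("see http://www.acme.com now"))
```

URLs that directly follow a tag or sit inside a quoted attribute are left alone, because neither is preceded by whitespace.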
Title |
Test
Details
email address (RFC 2822 mailbox)
|
Expression |
^((?>[a-zA-Z\d!#$%&'*+\-/=?^_`{|}~]+\x20*|"((?=[\x01-\x7f])[^"\\]|\\[\x01-\x7f])*"\x20*)*(?<angle><))?((?!\.)(?>\.?[a-zA-Z\d!#$%&'*+\-/=?^_`{|}~]+)+|"((?=[\x01-\x7f])[^"\\]|\\[\x01-\x7f])*")@(((?!-)[a-zA-Z\d\-]+(?<!-)\.)+[a-zA-Z]{2,}|\[(((?(?<!\[)\.)(25[0-5]|2[0-4]\d|[01]?\d?\d)){4}|[a-zA-Z\d\-]*[a-zA-Z\d]:((?=[\x01-\x7f])[^\\\[\]]|\\[\x01-\x7f])+)\])(?(angle)>)$ |
Description |
This accepts RFC 2822 email addresses in the form:
[email protected] OR
Blah < [email protected]>

RFC 2822 email 'mailbox':
mailbox = name-addr | addr-spec
name-addr = [display-name] "<" addr-spec ">"
addr-spec = local-part "@" domain
domain = rfc2821domain | rfc2821domain-literal

local-part conforms to RFC 2822.

domain is either:
An RFC 2821 domain (EXCEPT that the final sub-domain must consist of 2 or more letters only).
OR
An RFC 2821 address-literal.
(Note, no attempt is made to fully validate an IPv6 address-literal.)

Notes:
This pattern uses (.NET/Perl only?) features: the named group "(?<name>)" and the conditional/IF construct "(?(name))".

See this regexadvice.com thread for more info, including a version that does not use .NET features: http://regexadvice.com/forums/permalink/26742/26742/ShowThread.aspx#26742

RFC 2822 (and 822) do allow embedded comments, whitespace, and newlines within *some* parts of an email address, but the pattern above DOES NOT.

RFC 2822 (and 822) allow the domain to be a simple domain with NO ".", but this pattern requires a compound domain with at least one "." in the domain name, as per RFC 2821 (4.1.2).

RFC 2822 allows/disallows certain whitespace characters in parts of an email address, such as TAB, CR, LF, BUT the pattern above does NOT test for these, and assumes that they are not present in the string (on the basis that these characters are hard to enter into an edit box). |
Matches |
|
Non-Matches |
|
Author |
Rating:
Mark Cranness
|
Title |
Test
Details
Pattern Title
|
Expression |
href[ ]*=[ ]*('|\")([^\"'])*('|\") |
Description |
The regexes on this site for pulling links off a page always seemed to be faulty, or at least never worked with PHP, so I made this one. It is simple, as I'm an amateur with regexes, but I stumbled through it and this one actually works. Tested with the PHP function: preg_match_all("/href[ ]*=[ ]*('|\")([^\"'])*('|\")/",$string,$matches) |
Matches |
href="index.php" | href = 'http://www.dailymedication.com' | href = "irc://irc.junk |
Non-Matches |
href=http://www.dailymedication.com |
Author |
Rating:
Jason Paschal
|
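One detail of the pattern above is worth illustrating: group 2 is written as ([^\"'])*, a repeated single-character group, so after a match it holds only the last character of the value. The Python sketch below uses an assumed variant with the * moved inside the group so that the whole value is captured. HREF and href_values are names invented for this example.

```python
import re

# Variant of the pattern with the * moved inside group 2, so the group
# holds the whole attribute value rather than its last character.
HREF = re.compile(r"""href[ ]*=[ ]*('|")([^"']*)('|")""")


def href_values(html):
    """Return the quoted href values found in html."""
    return [m.group(2) for m in HREF.finditer(html)]


if __name__ == "__main__":
    page = "<a href=\"index.php\">a</a> <a href = 'http://www.dailymedication.com'>b</a>"
    print(href_values(page))
```

With the pattern as submitted, the full match (group 0) still covers the whole attribute, so code that only reads the full match is unaffected.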
Title |
Test
Details
Pattern Title
|
Expression |
href[\s]*=[\s]*"[^\n"]*" |
Description |
A very short pattern for extracting hrefs from HTML; it does not validate that they are within a tag. |
Matches |
href ="http://www.theregister.com/" | href="http://theregister.co.uk" | hre |
Non-Matches |
href=http://theregister.co.uk |
Author |
Rating:
Not yet rated.
Tony Hawe
|
Title |
Test
Details
Pattern Title
|
Expression |
<\s*a\s[^>]*\bhref\s*=\s*
('(?<url>[^']*)'|""(?<url>[^""]*)""|(?<url>\S*))[^>]*>
(?<body>(.|\s)*?)<\s*/a\s*> |
Description |
Suitable for extraction of all hyperlinks in the format:
<a ... href="..." ...> some text </a>
from a text document. Separates in groups the components of the links (url and body). |
Matches |
<a href="javascript:'window.close()'">close the window</a> | <a target=&quo |
Non-Matches |
<aa href="test.htm">test</a> | < a href hr = 'http://www.nakov.com'>...& |
Author |
Rating:
Svetlin Nakov
|
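The hyperlink extractor above relies on .NET conventions: the doubled "" comes from a C# verbatim string, and the group name url is reused across alternatives. Python's re module rejects duplicate group names, so the sketch below uses three differently named groups (sq, dq, bare, all invented here) and picks whichever one took part in the match. It also narrows the unquoted case from \S* to [^\s>]*, an assumption made so that an unquoted URL cannot run past the end of its tag.

```python
import re

LINK_RE = re.compile(
    r"""<\s*a\s[^>]*\bhref\s*=\s*
        (?:'(?P<sq>[^']*)'|"(?P<dq>[^"]*)"|(?P<bare>[^\s>]*))
        [^>]*>
        (?P<body>.*?)<\s*/a\s*>""",
    re.IGNORECASE | re.DOTALL | re.VERBOSE,
)


def extract_links(html):
    """Return a list of (url, body) pairs for each <a ... href=...> link."""
    links = []
    for m in LINK_RE.finditer(html):
        # Exactly one of the three URL groups participates in a match.
        candidates = (m.group("sq"), m.group("dq"), m.group("bare"))
        url = next(g for g in candidates if g is not None)
        links.append((url, m.group("body")))
    return links


if __name__ == "__main__":
    print(extract_links("<a href=x>one</a> <a href='y'>two</a>"))
```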
Title |
Test
Details
Pattern Title
|
Expression |
<a[\s]+[^>]*?href[\s]?=[\s\"\']+(.*?)[\"\']+.*?>([^<]+|.*?)?<\/a> |
Description |
This regex will extract the link and the link title for every a href in HTML source. Useful for crawling sites.
Note that this pattern will also allow for links that are spread over multiple lines. |
Matches |
<a href='http://www.regexlib.com'>Text</a> | <a href="...">Text</a> |
Non-Matches |
all other html tags |
Author |
Rating:
Not yet rated.
Jacek Sompel
|
Title |
Test
Details
Pattern Title
|
Expression |
href=[\"\']?((?:[^>]|[^\s]|[^"]|[^'])+)[\"\']? |
Description |
This will match just about everything after href=, because the alternation of negated classes accepts any character.
It's good if you just need a list of all the href= values |
Matches |
href="http://www.google.com/tsunami_relief.html" | href=/preferences?hl=en | href="ht |
Non-Matches |
src=blah blah |
Author |
Rating:
Not yet rated.
Chris Richards
|
Title |
Test
Details
Pattern Title
|
Expression |
(?<HTML><a[^>]*href\s*=\s*[\"\']?(?<HRef>[^"'>\s]*)[\"\']?[^>]*>(?<Title>[^<]+|.*?)?</a\s*>) |
Description |
Powerful href extractor for the HTML A element.
Captures the whole element, the URI and the link title in separate named groups so each can be used on its own.
These variants may be useful for other elements:
(?<HTML><area[^>]*href\s*=\s*[\"\']?(?<HRef>[^"'>\s]*)[\"\']?[^>]*>)
(?<HTML><form[^>]*action\s*=\s*[\"\']?(?<HRef>[^"'>\s]*)[\"\']?[^>]*>)
(?<HTML><frame[^>]*src\s*=\s*[\"\']?(?<HRef>[^"'>\s]*)[\"\']?[^>]*>)
(?<HTML><iframe[^>]*src\s*=\s*[\"\']?(?<HRef>[^"'>\s]*)[\"\']?[^>]*>)
(?<HTML><link[^>]*href\s*=\s*[\"\']?(?<HRef>[^"'>\s]*)[\"\']?[^>]*>) |
Matches |
<a href='http://www.regexlib.com'>Text</a> | <a href="...'>Text</a> | & |
Non-Matches |
all other html tags |
Author |
Rating:
Not yet rated.
Aivar Holyfield
|
Title |
Test
Details
Pattern Title
|
Expression |
(((file|gopher|news|nntp|telnet|http|ftp|https|ftps|sftp)://)|(www\.))+(([a-zA-Z0-9\._-]+\.[a-zA-Z]{2,6})|([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}))(/[a-zA-Z0-9\&%_\./-~-]*)? |
Description |
You can use this regular expression in your PHP scripts to convert a URL entered in text into a link. Example (written with preg_replace and # delimiters, since ereg_replace was removed in PHP 7):
$text=preg_replace("#(((file|gopher|news|nntp|telnet|http|ftp|https|ftps|sftp)://)|(www\.))+(([a-zA-Z0-9\._-]+\.[a-zA-Z]{2,6})|([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}))(/[a-zA-Z0-9\&%_\./-~-]*)?#","<a href=\"./redir.php?url=\\0\" target=\"_blank\">\\0</a>",$text); |
Matches |
http://diskusneforum.sk | www.diskusneforum.sk | ftp://123.123.123.123/ |
Non-Matches |
diskusneforum.sk |
Author |
Rating:
Martin Ille
|
Title |
Test
Details
Pattern Title
|
Expression |
<a[a-zA-Z0-9 ="'.:;?]*(name=){1}[a-zA-Z0-9 ="'.:;?]*\s*((/>)|(>[a-zA-Z0-9 ="'<>.:;?]*</a>)) |
Description |
This expression matches only valid HTML anchors, that is, anchors with a name= attribute. Such an anchor can be closed either with </a> or with />.
If someone can help: one thing still missing is rejecting tags that have an href attribute, because those should not be considered valid anchors. |
Matches |
<a name="anchorName">Anchor</a> | <a name=anchorName /> |
Non-Matches |
<a href="somewhere"> | <a href> | <a name /> |
Author |
Rating:
Aleš Potocnik
|
Title |
Test
Details
Pattern Title
|
Expression |
<a[a-zA-Z0-9 ="'.?_/]*(href\s*=\s*){1}[a-zA-Z0-9 ="'.?_/]*\s*((/>)|(>[a-zA-Z0-9 ="'<>.?_/]*</a>)) |
Description |
An expression that matches all XHTML-valid hrefs (links). It even allows spaces as in href = "href...", though this is not quite valid XHTML. It finds only hrefs, not, for instance, anchors. If you need to find only anchors, replace "href" in the expression with "name" and that's it. |
Matches |
<a href="www.google.com">Google</a> | <a href=www.google.com /> | <a |
Non-Matches |
<a name="anchor">Anchor</a> | <img src="image.gif"> |
Author |
Rating:
Aleš Potocnik
|
Title |
Test
Details
Pattern Title
|
Expression |
\b((?#optional protocol)(https?|ftp|file)://)?
(?#sub domain)([a-z0-9](?:[-a-z0-9]*[a-z0-9])?\.)+
(?#top domain)(com\b|edu\b|biz\b|gov\b|in(?:t|fo)\b|mil\b|net\b|org\b|[a-z][a-z]\b)
(?#optional port)(:\d+)?
(?#optional path)(/[-a-z0-9_:\@&?=+,.!/~*'%\$]*)*
(?#not ending in)(?<![.,?!])
(?#not enclosed in)(?!((?!(?:<a )).)*?(?:</a>))
(?#or enclosed in)(?!((?!(?:<!--)).)*?(?:-->)) |
Description |
Yet Another URL Search. Useful for capturing URLs in raw text. Ignores URLs in HREF and comments. Turn off whitespacing to test! |
Matches |
http://www.google.com | google.com | http://some-domain.net/very/long/path/123.html |
Non-Matches |
subdomain.NonExistentTopDomain | <a href="http://www.google.com">www.google.com</ |
Author |
Rating:
Not yet rated.
Simon Ferguson
|
Title |
Test
Details
Pattern Title
|
Expression |
<a.*?href=(.*?)(?((?:\s.*?)>.*?</a>)(?:(?:\s.*?)>(.*?)</a>)|(?:>(.*?)</a>)) |
Description |
This expression uses a conditional expression to evaluate the parameter after "HREF" and executes the yes/no part of the expression. It finds the <A> tag and returns the value of "HREF" and the value held between the <a></a> tags. The expression returns a maximum of 3 submatches. The first returns the "HREF" value, and the other two alternately hold the tag's text, so after executing the expression you need to iterate through all the submatches and pick the non-NULL ones to get the value.
The output of the first matching example would look like this:
1: "/url?sa=p&pref=ig&pval=2&q=http://www.google.co.in/ig%3Fhl%3Den"
2:[Personalized Home]
3:[] or NULL
the output of the second matching example would be like this.
1:/advanced_search?hl=en
2:[] or NULL
3:[Advanced Search] |
Matches |
<a href="/url?sa=p&pref=ig&pval=2&q=http://www.google.co.in/ig%3Fhl%3Den" o |
Non-Matches |
none |
Author |
Rating:
himraj love
|
Title |
Test
Details
Pattern to find Anchor Tag in a web page
|
Expression |
<a[\s]+[^>]*?href[\s]?=[\s\"\']*(.*?)[\"\']*.*?>([^<]+|.*?)?<\/a> |
Description |
This pattern is a slight modification of the pattern submitted by Jacek Sompel. Using this pattern one can also match anchor tags that have no ' (single quote) or " (double quote) around the href value. This is useful for a web crawler collecting all links in a web page. |
Matches |
<a href='http://www.regexlib.com'>Text</a> | <a href="...">Text</a> | <a href=http://www.regexlib.com>Text</a> |
Non-Matches |
all other html tags |
Author |
Rating:
Kuleen Upadhyaya
|
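When the anchor-tag pattern above is tried in Python's re module, the tag is matched but group 1 comes back empty: the lazy (.*?) is followed by an optional quote class and another lazy .*?, so the second one absorbs the URL. The sketch below shows that behaviour next to an assumed variant that captures the URL with a negated character class instead. AS_SUBMITTED, VARIANT and anchors are names made up for this example, and other regex engines may need their own check.

```python
import re

# The pattern as submitted (redundant escapes dropped).
AS_SUBMITTED = re.compile(
    r"""<a[\s]+[^>]*?href[\s]?=[\s"']*(.*?)["']*.*?>([^<]+|.*?)?</a>"""
)

# Assumed variant: a negated class replaces the lazy (.*?) for the URL,
# so group 1 stops at whitespace, a quote or the closing >.
VARIANT = re.compile(
    r"""<a\s+[^>]*?href\s?=[\s"']*([^\s"'>]*)["']*.*?>([^<]+|.*?)?</a>"""
)


def anchors(html, pattern=VARIANT):
    """Return (url, text) tuples for each anchor tag matched by pattern."""
    return pattern.findall(html)


if __name__ == "__main__":
    sample = "<a href=http://www.regexlib.com>Text</a>"
    print(anchors(sample, AS_SUBMITTED))
    print(anchors(sample))
```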