| Title |
Test
Details
Pattern Title
|
| Expression |
<[a-zA-Z][^>]*\son\w+=(\w+|'[^']*'|"[^"]*")[^>]*> |
| Description |
Finds HTML tags that have JavaScript event handlers attached to them. |
| Matches |
<IMG onmouseover="window.close()"> |
| Non-Matches |
<IMG src="star.gif"> |
| Author |
Rating:
Not yet rated.
Lewis Moten
|
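A minimal sketch of how the expression above could be applied with Python's `re` module, using the listed match and non-match as inputs. The helper name `has_event_handler` is hypothetical and only serves the illustration.

```python
import re

# Pattern copied verbatim from the entry above.
EVENT_TAG = re.compile(r"""<[a-zA-Z][^>]*\son\w+=(\w+|'[^']*'|"[^"]*")[^>]*>""")


def has_event_handler(html: str) -> bool:
    """Return True if some tag in `html` carries an on* attribute (hypothetical helper)."""
    return EVENT_TAG.search(html) is not None


print(has_event_handler('<IMG onmouseover="window.close()">'))  # True
print(has_event_handler('<IMG src="star.gif">'))                # False
```

Note that the pattern expects `=` directly after the attribute name, so an attribute written with spaces around `=` would not be reported.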
| Title |
Test
Details
Pattern Title
|
| Expression |
^<a\s+href\s*=\s*"http:\/\/([^"]*)"([^>]*)>(.*?(?=<\/a>))<\/a>$ |
| Description |
Regexp to find all external links in an HTML string.
Can easily be modified to handle all/other links/protocols (like file/https/ftp).
Uses a lookahead assertion and a non-greedy quantifier to check for the closing </a> while still allowing HTML tags in between the start and end A tags.
Takes into account that there could be line breaks and other nasty whitespace characters in the middle of the tag.
I am using it to find all external links in embedded HTML code and then (1) change the target of the link and (2) insert a "Leaving Site" logo to show that you are leaving the site. |
| Matches |
<a href="http://www.mysite.com">my external link</a> | <a href="http:/ |
| Non-Matches |
<a href="myinternalpage.html">my internal link</a> |
| Author |
Rating:
Not yet rated.
Anders Rask
|
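The use described above (changing the target of an external link) might look roughly like this in Python. The function name `retarget` and the choice of `target="_blank"` are assumptions made for the sketch; the entry does not say which target the author used.

```python
import re

# Pattern copied verbatim from the entry above; it is anchored with ^ and $,
# so it only matches when the whole string is a single link.
EXTERNAL_LINK = re.compile(
    r'^<a\s+href\s*=\s*"http:\/\/([^"]*)"([^>]*)>(.*?(?=<\/a>))<\/a>$'
)


def retarget(link: str) -> str:
    """Add target="_blank" to an external link (assumed target, for illustration)."""
    m = EXTERNAL_LINK.match(link)
    if m is None:
        return link  # internal link or not a link at all: leave unchanged
    url, attrs, text = m.groups()
    return f'<a href="http://{url}"{attrs} target="_blank">{text}</a>'


print(retarget('<a href="http://www.mysite.com">my external link</a>'))
print(retarget('<a href="myinternalpage.html">my internal link</a>'))
```

Because of the `^` and `$` anchors, scanning a larger HTML document would need each link isolated first, or a variant of the pattern without the anchors.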
| Title |
Test
Details
Pattern Title
|
| Expression |
>(?:(?<t>[^<]*)) |
| Description |
Detects HTML tags open and/or closed with and without whitespace or characters in between. Good for stripping all tags from a string. |
| Matches |
<b> | </b> | <p><b>some text</b></p> |
| Non-Matches |
< |
| Author |
Rating:
Not yet rated.
Jonathan Crossland
|
| Title |
Test
Details
Pattern Title
|
| Expression |
<[a-zA-Z]+(\s+[a-zA-Z]+\s*=\s*("([^"]*)"|'([^']*)'))*\s*/> |
| Description |
Matches a valid "empty" tag (one with a trailing slash). Note that if you run it against a string such as <img src="test.gif" alt="<hr />"> it will indeed return a match. The match is not at character 1 as you might expect; it is the internal <hr />. If you look at the source of this tag (http://concepts.waetech.com/unclosed_tags/) you'll find a whole suite of regexes for matching HTML tags. Using them you could feasibly step through a document and avoid this mismatch, as the outer tag would match *in totality* and you'd completely skip this inner match.
|
| Matches |
<img src="test.gif"/> |
| Non-Matches |
<img src="test.gif"> | <img src="test.gif"a/> |
| Author |
Rating:
Not yet rated.
Joshua Olson
|
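The inner-match behaviour described above can be reproduced with a short Python check. This is only a sketch using the sample strings from the entry; the variable names are arbitrary.

```python
import re

# Pattern copied verbatim from the entry above.
EMPTY_TAG = re.compile(
    r"""<[a-zA-Z]+(\s+[a-zA-Z]+\s*=\s*("([^"]*)"|'([^']*)'))*\s*/>"""
)

tricky = '<img src="test.gif" alt="<hr />">'
m = EMPTY_TAG.search(tricky)

# The outer <img ...> tag has no trailing slash, so the search moves on
# and finds the <hr /> embedded in the alt attribute instead.
print(m.group(0))   # <hr />
print(m.start())    # offset of the inner tag, not 0

# The listed match and non-matches behave as documented.
print(EMPTY_TAG.search('<img src="test.gif"/>') is not None)   # True
print(EMPTY_TAG.search('<img src="test.gif">') is not None)    # False
print(EMPTY_TAG.search('<img src="test.gif"a/>') is not None)  # False
```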
| Title |
Test
Details
Pattern Title
|
| Expression |
href=[\"\'](http:\/\/|\.\/|\/)?\w+(\.\w+)*(\/\w+(\.\w+)?)*(\/|\?\w*=\w*(&\w*=\w*)*)?[\"\'] |
| Description |
I wrote up this regular expression to fetch the href attribute found in <a> tags as well as a few other HTML tags. |
| Matches |
href="www.yahoo.com" | href="http://localhost/blah/" | href="eek" |
| Non-Matches |
href="" | href=eek | href="bad example" |
| Author |
Rating:
Andrew Lee
|
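As a quick check of the listed matches and non-matches, the expression above can be run through Python's `re` module. The sketch below assumes nothing beyond the pattern and the sample strings from the entry.

```python
import re

# Pattern copied verbatim from the entry above.
HREF = re.compile(
    r"""href=[\"\'](http:\/\/|\.\/|\/)?\w+(\.\w+)*(\/\w+(\.\w+)?)*(\/|\?\w*=\w*(&\w*=\w*)*)?[\"\']"""
)

samples = [
    'href="www.yahoo.com"',
    'href="http://localhost/blah/"',
    'href="eek"',
    'href=""',
    'href=eek',
    'href="bad example"',
]

for s in samples:
    # search() returns None when the attribute does not fit the pattern
    print(f"{s!r:35} -> {HREF.search(s) is not None}")
```

The first three samples are reported as matches and the last three are not, which agrees with the lists in the entry.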
| Title |
Test
Details
Pattern Title
|
| Expression |
<a[a-zA-Z0-9 ="'.:;?]*(name=){1}[a-zA-Z0-9 ="'.:;?]*\s*((/>)|(>[a-zA-Z0-9 ="'<>.:;?]*</a>)) |
| Description |
This expression matches only valid HTML anchors, that is, anchors with a name= attribute. Such an anchor can be closed either with </a> or with />.
If someone can help: one thing still missing is rejecting tags that have an href attribute, because those should not be considered valid anchors. |
| Matches |
<a name="anchorName">Anchor</a> | <a name=anchorName /> |
| Non-Matches |
<a href="somewhere"> | <a href> | <a name /> |
| Author |
Rating:
Aleš Potocnik
|
| Title |
Test
Details
Pattern Title
|
| Expression |
(?<TAG>\s*<(?<TAG_NAME>\w*)\s+(?<PARAMETERS>(?<PARAMETER>(?<PARAMETER_NAME>\w*)(=["']?)(?<VALUE>[\w\W\d]*?)["']?)+)\s*/?>) |
| Description |
Parse html tags to extract tag names and parameters with parameter name/value pairs. |
| Matches |
<td valign="top" align="left" colspan="2"> |
| Non-Matches |
<!--dynamic_content GlobalID=49113--> |
| Author |
Rating:
Maxim Paukov
|
| Title |
Test
Details
Pattern Title
|
| Expression |
<a[a-zA-Z0-9 ="'.:;?]*(href=[\"\'](http:\/\/|\.\/|\/)?\w+(\.\w+)*(\/\w+(\.\w+)?)*(\/|\?\w*=\w*(&\w*=\w*)*)?[\"\'])*(>[a-zA-Z0-9 ="'<>.:;?]*</a>) |
| Description |
I merged two regular expressions that I found on this site. Thanks to their authors, Aleš Potocnik and Andrew Lee; I used their expressions to build mine. This expression finds a URL/hyperlink together with its surrounding HTML tags. |
| Matches |
<a href="http://www.google.co.in/hi">Hindi</a> |
| Non-Matches |
href="http://www.google.co.in/hi" |
| Author |
Rating:
Not yet rated.
himraj love
|
| Title |
Test
Details
Self Close Valid HTML Tags
|
| Expression |
<(?<!\\?|\\/)([^>]*)>\\r*\\n<\\/(?=br|hr|img|input|link|param)[^>]*>
|
| Description |
This pattern searches for tags in HTML that should be self closing but currently aren't and self closes them. This is useful if you are doing some HTML parsing. |
| Matches |
<br> CRLF </br> etc. |
| Non-Matches |
<textarea> CRLF </textarea> etc. |
| Author |
Rating:
Not yet rated.
Iain Dooley
|
| Title |
Test
Details
replace html tags with valid xhtml
|
| Expression |
(<input )(.*?)(>) |
| Description |
Finds all <input attrib1="value1" attrib2="value2" ... > tags. You can make it end with "/>" for xhtml compatibility replacing with the expression "<input $2 />". You can repeat it with other tags like <img /> or <br / > |
| Matches |
<input attrib1="value1" attrib2="value2" > |
| Non-Matches |
any other tag |
| Author |
Rating:
Not yet rated.
Mauricio Venanzoni
|
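The replacement described above uses `$2` for the second group; in Python's `re.sub` the same group is written `\2`. A minimal sketch, with `close_inputs` as a hypothetical helper name:

```python
import re

# Pattern copied verbatim from the entry above.
INPUT_TAG = re.compile(r"(<input )(.*?)(>)")


def close_inputs(html: str) -> str:
    """Rewrite <input ...> as <input ... /> (hypothetical helper)."""
    # \2 is Python's spelling of the $2 group reference used in the entry.
    return INPUT_TAG.sub(r"<input \2 />", html)


print(close_inputs('<input type="text" name="q">'))
# <input type="text" name="q" />
```

One thing to watch: a tag that is already self-closed, such as `<input name="q" />`, also matches, and the substitution then yields `<input name="q" / />`, so the replacement is best run only on markup known to contain unclosed tags.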
| Title |
Test
Details
Find <h1> Tags
|
| Expression |
<h([1-6])>([^<]*)</h([1-6])> |
| Description |
This regex finds valid <h1> through <h6> HTML tags. |
| Matches |
<h2>test2</h2><h3>test3</h3> |
| Non-Matches |
<h>test1</h> |
| Author |
Rating:
Syrprize
|
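A short Python sketch of the heading pattern above. Since the opening and closing levels are captured independently, a mismatched pair such as `<h1>...</h2>` also matches; the stricter variant with a `\1` backreference is an assumption added here for comparison and is not part of the original entry.

```python
import re

# Pattern copied verbatim from the entry above.
HEADING = re.compile(r"<h([1-6])>([^<]*)</h([1-6])>")

# Hypothetical stricter variant: the closing level must equal the opening one.
HEADING_STRICT = re.compile(r"<h([1-6])>([^<]*)</h\1>")

print(HEADING.findall("<h2>test2</h2><h3>test3</h3>"))
# [('2', 'test2', '2'), ('3', 'test3', '3')]

print(HEADING.findall("<h>test1</h>"))
# []

# Mismatched levels: accepted by the original, rejected by the strict variant.
print(HEADING.search("<h1>oops</h2>") is not None)         # True
print(HEADING_STRICT.search("<h1>oops</h2>") is not None)  # False
```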
| Title |
Test
Details
List HTML tags
|
| Expression |
<(?![!/]?[ABIU][>\s])[^>]*> |
| Description |
Used to return all the HTML tags and closing tags in a section of HTML. Can be used to replace all the tags with nothing or to iterate through them. |
| Matches |
<u><b>hello</b></u> |
| Non-Matches |
hello |
| Author |
Rating:
Not yet rated.
Richard Brisley
|
| Title |
Test
Details
Remove (X)HTML like tags
|
| Expression |
<\s*?[^>]+\s*?> |
| Description |
This simple pattern is useful for removing all HTML tags, with or without attributes. It does not remove whitespace. |
| Matches |
< html > | < div style="title_1" class='number'> | < div style="title_1" class='number' > | < img src="img.gif" / > |
| Non-Matches |
Plain text |
| Author |
Rating:
Shreeve
|
| Title |
Test
Details
Cleaning HTML
|
| Expression |
<\/{0,1}(?!\/|b>|i>|p>|a\s|a>|br|em>|ol|li|strong>)[^>]*> |
| Description |
Following a bit of work this morning trying to get something to strip out arbitrary HTML but leave 'known' tags in place, we have come up with the following, which may be useful. It uses the 'negative lookahead' construct, written '?!'. It looks for an angle bracket and perhaps a forward slash, as long as it is *not* followed by one of the terms in the ?! section. The brackets in this section do not return a value; they are part of the construct. This regexp can therefore be used to replace all unknown tags with blanks. Obviously you can add other 'good' HTML tags to the list. |
| Matches |
<table>...</table> |
| Non-Matches |
blah blah blah. |
| Author |
Rating:
Not yet rated.
Gordon Buxton
|
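The stripping step described above could be sketched in Python as follows. The helper name `strip_unknown_tags` and the sample markup are made up for the illustration; the pattern itself is unchanged.

```python
import re

# Pattern copied verbatim from the entry above.
UNKNOWN_TAG = re.compile(
    r"<\/{0,1}(?!\/|b>|i>|p>|a\s|a>|br|em>|ol|li|strong>)[^>]*>"
)


def strip_unknown_tags(html: str) -> str:
    """Remove every tag that the negative lookahead does not whitelist."""
    return UNKNOWN_TAG.sub("", html)


sample = "<table><tr><td><b>bold</b> and <i>it</i></td></tr></table>"
print(strip_unknown_tags(sample))
# <b>bold</b> and <i>it</i>
```

The terms `br`, `ol` and `li` are listed without a closing `>`, so any tag name that merely starts with them is kept as well; for example `<link rel="x">` survives because it begins with `li`.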
| Title |
Test
Details
Match Valid HTML Tags As Browser
|
| Expression |
<(/)?(a|abbr|acronym|address|applet|area|b|base|basefont|bdo|big|blockquote|body|br|button|caption|center|cite|code|col|colgroup|dd|del|dir|div|dfn|dl|dt|em|fieldset|font|form|frame|frameset|h[1-6]|head|hr|html|i|iframe|img|input|ins|isindex|kbd|label|legend|li|link|map|menu|meta|noframes|noscript|object|ol|optgroup|option|p|param|pre|q|s|samp|script|select|small|span|strike|strong|style|sub|sup|table|tbody|td|textarea|tfoot|th|thead|title|tr|tt|u|ul|var|xmp){1}(\s(\"[^\"]*\"*|[^>])*)*> |
| Description |
This should match all valid HTML 4.01 tags as a browser would recognize them. If you omit the second ", it will continue until it finds one to pair with, so if it doesn't find one, it continues until the end. This is how most browsers work, I believe. It does have a few flaws: it will match </img> and </input>, which is weird, but perhaps I'll fix that eventually. |
| Matches |
</a> <h2 > </h2 asfsdf> <a href="abc>>123"> |
| Non-Matches |
< /a> </h 2 asfsdf> <ahref="abc123"> |
| Author |
Rating:
John Smith
|
| Title |
Test
Details
Match Valid HTML Tags
|
| Expression |
</?(a|abbr|acronym|address|applet|area|b|base|basefont|bdo|big|blockquote|body|br|button|caption|center|cite|code|col|colgroup|dd|del|dir|div|dfn|dl|dt|em|fieldset|font|form|frame|frameset|h[1-6]|head|hr|html|i|iframe|img|input|ins|isindex|kbd|label|legend|li|link|map|menu|meta|noframes|noscript|object|ol|optgroup|option|p|param|pre|q|s|samp|script|select|small|span|strike|strong|style|sub|sup|table|tbody|td|textarea|tfoot|th|thead|title|tr|tt|u|ul|var|xmp)\b((\"[^\"]*\"|\'[^\']*\')*|[^\"\'>])*> |
| Description |
This is very similar to my other expression, except it only matches tags that a browser would read, so if you have an extra " in the tag, it will not count it, and move onto the next possibility. |
| Matches |
</a> <h2 > </a asdfs> </h2 asfsdf> <a href="abc>>123"> |
| Non-Matches |
< /a> </h 2 asfsdf> <ahref="abc"123"> |
| Author |
Rating:
John Smith
|
| Title |
Test
Details
HTML Tags
|
| Expression |
</?[a-z][a-z0-9]*[^<>]*> |
| Description |
Matches any HTML tag with any parameters. Very useful for cleaning HTML out of a text. |
| Matches |
<tr style="height: 1px; background-color: #ffffff"> <td colspan="4"> </br> |
| Non-Matches |
Any other text outside a tag symbols < > |
| Author |
Rating:
Not yet rated.
Roberto Santana
|
| Title |
Test
Details
HTML Tags and Comments
|
| Expression |
<!*[^<>]*> |
| Description |
Matches any HTML tag with any parameters, as well as HTML comments. Very useful for cleaning HTML out of a text. |
| Matches |
<tr style="height: 1px; background-color: #ffffff"> <td colspan="4"> <!-- comment --> <!DOCTYPE html PUBLIC ... > |
| Non-Matches |
Any other text outside a tag symbols < > |
| Author |
Rating:
Roberto Santana
|
| Title |
Test
Details
Remove all attributes related to event handling from inside HTML tags
|
| Expression |
(\s(\bon[a-zA-Z][a-z]+)\s?\=\s?[\'\"]?(javascript\:)?[\w\(\),\' ]*;?[\'\"]?)+ |
| Description |
No idea whether anyone would ever need this, but I had to work half a day on this pattern, so I decided to share it. :) It was never meant for production use at all; it was rather meant to filter out all that annoying event handling stuff to find a bug in my DHTML table-generating script. Give it a try with this string (see details):
<div id="TSelect_TD_value_911" class="TSel" onpaste="" onblur="TSelectClose(this);" onClick="TSelectOpen(this);" style="width:250px; padding:2px;"> |
| Matches |
onPaste onBlur onClick ... ; onblur onclick onpaste ... |
| Non-Matches |
<div id="TSelect_TD_value_911" class="TSel" style="width:250px; padding:2px;"> |
| Author |
Rating:
globalplayer
|
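Trying the pattern above on the suggested string, as a minimal Python sketch. The helper name `strip_event_attrs` is hypothetical; the pattern and the test string are taken from the entry.

```python
import re

# Pattern copied verbatim from the entry above.
EVENT_ATTRS = re.compile(
    r"""(\s(\bon[a-zA-Z][a-z]+)\s?\=\s?[\'\"]?(javascript\:)?[\w\(\),\' ]*;?[\'\"]?)+"""
)


def strip_event_attrs(tag: str) -> str:
    """Drop runs of on* attributes from a tag (hypothetical helper)."""
    return EVENT_ATTRS.sub("", tag)


tag = (
    '<div id="TSelect_TD_value_911" class="TSel" onpaste="" '
    'onblur="TSelectClose(this);" onClick="TSelectOpen(this);" '
    'style="width:250px; padding:2px;">'
)

print(strip_event_attrs(tag))
# <div id="TSelect_TD_value_911" class="TSel" style="width:250px; padding:2px;">
```

The three handlers sit next to each other, so the outer `(...)+` consumes them as one run and a single substitution leaves the string listed under Non-Matches.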