Title |
Test
Details
Cleaning HTML
|
Expression |
<\/{0,1}(?!\/|b>|i>|p>|a\s|a>|br|em>|ol|li|strong>)[^>]*> |
Description |
Following a bit of work this morning trying to strip out arbitrary HTML while leaving 'known' tags in place, we came up with the following, which may be useful. It uses the 'negative lookahead' construct, '(?!...)'. It looks for an opening angle bracket and an optional forward slash, as long as they are *not* followed by one of the terms in the ?! section. The parentheses in this section do not capture a value; they are part of the construct. The regexp can therefore be used to replace all unknown tags with blanks. You can add other 'good' HTML tags to the list. Note that the terms without a trailing '>' (br, ol, li) are prefix matches, so tags such as <link> are also left in place. |
Matches |
<table>...</table> |
Non-Matches |
blah blah blah. |
Author |
Rating:
Not yet rated.
Gordon Buxton
|
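A minimal sketch of how this expression might be applied, assuming Python's re module (the entry does not name a language). The function name strip_unknown_tags is hypothetical, and the replacement here is an empty string rather than a blank.

```python
import re

# The expression from this entry, copied unchanged.
UNKNOWN_TAG = re.compile(
    r"<\/{0,1}(?!\/|b>|i>|p>|a\s|a>|br|em>|ol|li|strong>)[^>]*>"
)

def strip_unknown_tags(html: str) -> str:
    """Remove every tag that is not on the allow-list inside the lookahead."""
    return UNKNOWN_TAG.sub("", html)

sample = "<div><b>bold</b> and <table>cell</table></div>"
print(strip_unknown_tags(sample))  # <b>bold</b> and cell
```

The allowed tags (<b> here) survive because the lookahead fails both with and without the optional slash consumed.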
Title |
Test
Details
Match Valid HTML Tags As Browser
|
Expression |
<(/)?(a|abbr|acronym|address|applet|area|b|base|basefont|bdo|big|blockquote|body|br|button|caption|center|cite|code|col|colgroup|dd|del|dir|div|dfn|dl|dt|em|fieldset|font|form|frame|frameset|h[1-6]|head|hr|html|i|iframe|img|input|ins|isindex|kbd|label|legend|li|link|map|menu|meta|noframes|noscript|object|ol|optgroup|option|p|param|pre|q|s|samp|script|select|small|span|strike|strong|style|sub|sup|table|tbody|td|textarea|tfoot|th|thead|title|tr|tt|u|ul|var|xmp){1}(\s(\"[^\"]*\"*|[^>])*)*> |
Description |
This should match all valid HTML 4.01 tags the way a browser would recognize them. If a closing " is missing, the match continues until it finds one to pair with; if it never finds one, it continues until the end. I believe this is how most browsers work. It has a few flaws: it matches </img> and </input>, which is odd, but perhaps I'll fix that eventually. |
Matches |
</a> <h2 > </h2 asfsdf> <a href="abc>>123"> |
Non-Matches |
< /a> </h 2 asfsdf> <ahref="abc123"> |
Author |
Rating:
John Smith
|
Title |
Test
Details
Match Valid HTML Tags
|
Expression |
</?(a|abbr|acronym|address|applet|area|b|base|basefont|bdo|big|blockquote|body|br|button|caption|center|cite|code|col|colgroup|dd|del|dir|div|dfn|dl|dt|em|fieldset|font|form|frame|frameset|h[1-6]|head|hr|html|i|iframe|img|input|ins|isindex|kbd|label|legend|li|link|map|menu|meta|noframes|noscript|object|ol|optgroup|option|p|param|pre|q|s|samp|script|select|small|span|strike|strong|style|sub|sup|table|tbody|td|textarea|tfoot|th|thead|title|tr|tt|u|ul|var|xmp)\b((\"[^\"]*\"|\'[^\']*\')*|[^\"\'>])*> |
Description |
This is very similar to my other expression, except that it only matches tags a browser would read: if a tag contains an unpaired ", that tag is not matched and the search moves on to the next candidate. |
Matches |
</a> <h2 > </a asdfs> </h2 asfsdf> <a href="abc>>123"> |
Non-Matches |
< /a> </h 2 asfsdf> <ahref="abc"123"> |
Author |
Rating:
John Smith
|
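The Matches and Non-Matches rows above can be checked with a short sketch, assuming Python's re module. The tag list is copied from the expression; the names TAGS, VALID_TAG and find_tags are hypothetical.

```python
import re

# Tag names copied from the expression in this entry.
TAGS = (
    "a|abbr|acronym|address|applet|area|b|base|basefont|bdo|big|blockquote|"
    "body|br|button|caption|center|cite|code|col|colgroup|dd|del|dir|div|dfn|"
    "dl|dt|em|fieldset|font|form|frame|frameset|h[1-6]|head|hr|html|i|iframe|"
    "img|input|ins|isindex|kbd|label|legend|li|link|map|menu|meta|noframes|"
    "noscript|object|ol|optgroup|option|p|param|pre|q|s|samp|script|select|"
    "small|span|strike|strong|style|sub|sup|table|tbody|td|textarea|tfoot|th|"
    "thead|title|tr|tt|u|ul|var|xmp"
)

# Same structure as the entry: tag name, word boundary, then any mix of
# quoted strings and characters that are not quotes or '>'.
VALID_TAG = re.compile(
    r"</?(" + TAGS + r")\b((\"[^\"]*\"|\'[^\']*\')*|[^\"\'>])*>"
)

def find_tags(text: str) -> list:
    """Return the full text of every tag the expression matches."""
    return [m.group(0) for m in VALID_TAG.finditer(text)]

print(find_tags('</a> <h2 > </a asdfs> </h2 asfsdf> <a href="abc>>123">'))
print(find_tags('< /a> </h 2 asfsdf> <ahref="abc"123">'))  # []
```

Because '>' inside a quoted value is consumed by the quoted-string branch, <a href="abc>>123"> is returned as a single tag.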
Title |
Test
Details
HTML Tag Remover
|
Expression |
<\/?(tag1|tag2)[^>]*\/?> |
Description |
This expression is good if you need to clean up some code (like from using DW Design View or Front Page). Just replace "tag1" and "tag2" with the tags you want; you can add more by putting a | between each tag. |
Matches |
<tag1 name="input" id="input"> | <tag2> | <tag1 src="/home.jpg" /> | </tag1> |
Non-Matches |
<tag1 | tag2> |
Author |
Rating:
Not yet rated.
Kerry Jones
|
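As an illustration, the placeholders might be filled in programmatically. This is a sketch assuming Python's re module; make_tag_remover and the choice of font and span are hypothetical.

```python
import re

def make_tag_remover(*tags: str) -> "re.Pattern":
    """Build the entry's expression with tag1|tag2 replaced by the given tags."""
    names = "|".join(re.escape(t) for t in tags)
    return re.compile(r"<\/?(" + names + r")[^>]*\/?>")

remover = make_tag_remover("font", "span")
dirty = '<p><font face="arial">Hi</font> <span>there</span></p>'
cleaned = remover.sub("", dirty)
print(cleaned)  # <p>Hi there</p>
```

Only the tags themselves are removed; the text between them is kept. Since nothing in the expression ends the tag name, a listed name also matches longer tags that start with it (span would also remove <spanner>).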
Title |
Test
Details
HTML Tag and InnerHTML Remover
|
Expression |
<(tag1|tag2)[^>]*\/?>.*<\/(?:\1)> |
Description |
Matches an opening tag, its matching closing tag, and all the text between them, so the whole element can be removed. Replace "tag1" and "tag2" with the tags you want to remove; you can add more by placing an additional "|" between each tag. |
Matches |
<tag1>Welcome!</tag1> | <tag1>How <b>are</b> you?</tag1> |
Non-Matches |
<tag1>How <b>are</b> you? | <tag1>How <b>are</b> you?</tag2> |
Author |
Rating:
Not yet rated.
Kerry Jones
|
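A small sketch of this expression in use, assuming Python's re module; the name ELEMENT is hypothetical. It also shows one consequence of the greedy ".*" that is worth knowing before relying on it.

```python
import re

# The expression from this entry, copied unchanged. \1 forces the closing
# tag to repeat whichever name the first group captured.
ELEMENT = re.compile(r"<(tag1|tag2)[^>]*\/?>.*<\/(?:\1)>")

one = ELEMENT.sub("", "a <tag1>How <b>are</b> you?</tag1> b")
print(repr(one))  # 'a  b'

# Mismatched closing tag: no match, so the text is left alone.
print(ELEMENT.search("<tag1>How <b>are</b> you?</tag2>"))  # None

# Greedy ".*" runs to the LAST closing tag on the line, so text between
# two separate elements is removed as well.
two = ELEMENT.sub("", "<tag1>x</tag1> keep <tag1>y</tag1>")
print(repr(two))  # ''
```

Using a lazy quantifier (".*?") would be one way to stop at the first closing tag; that is a variation, not part of the original entry.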
Title |
Test
Details
While Not Between
|
Expression |
(?<!aaa((?!bbb)[\s\S])*)SomeText |
Description |
This .NET regex will match "SomeText" while not between the words "aaa" and "bbb". A good use of this is to find certain text while you are not inside of a certain HTML tag. |
Matches |
SomeText |
Non-Matches |
aaa SomeText bbb |
Author |
Rating:
Not yet rated.
Timothy Khouri
|
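The expression relies on .NET's variable-length lookbehind, which Python's re module does not support. The sketch below therefore swaps in a different technique: find each candidate, then inspect the text before it by hand. The function name find_outside and its parameters are hypothetical.

```python
import re

def find_outside(text, target="SomeText", opener="aaa", closer="bbb"):
    """Return start offsets of `target` that are not preceded by an
    `opener` that is still unclosed (no `closer` between it and the hit)."""
    hits = []
    for m in re.finditer(re.escape(target), text):
        before = text[:m.start()]
        last_open = before.rfind(opener)
        # Keep the hit if no opener precedes it, or if a closer appears
        # after the most recent opener.
        if last_open == -1 or closer in before[last_open + len(opener):]:
            hits.append(m.start())
    return hits

print(find_outside("SomeText"))            # [0]
print(find_outside("aaa SomeText bbb"))    # []
print(find_outside("aaa x bbb SomeText"))  # [10]
```

Checking only the most recent opener is enough here: if any earlier opener were still unclosed, the most recent one would be unclosed too.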
Title |
Test
Details
HTML Tags
|
Expression |
</?[a-z][a-z0-9]*[^<>]*> |
Description |
Matches any HTML tag with any parameters. Very useful for stripping HTML from text. |
Matches |
<tr style="height: 1px; background-color: #ffffff"> <td colspan="4"> </br> |
Non-Matches |
Any text outside the tag symbols < > |
Author |
Rating:
Not yet rated.
Roberto Santana
|
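A minimal sketch of the clean-up use mentioned in the description, assuming Python's re module; strip_tags is a hypothetical name.

```python
import re

# The expression from this entry, copied unchanged.
TAG = re.compile(r"</?[a-z][a-z0-9]*[^<>]*>")

def strip_tags(text: str) -> str:
    """Delete everything the expression recognizes as a tag."""
    return TAG.sub("", text)

row = ('<tr style="height: 1px; background-color: #ffffff">'
       '<td colspan="4">cell</td></tr> 1 < 2')
print(strip_tags(row))  # cell 1 < 2
```

The bare "<" in "1 < 2" is left alone because a letter must follow the bracket. As written the expression is lowercase only, so <TD> is not matched unless a case-insensitive flag (re.IGNORECASE in Python) is added.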
Title |
Test
Details
HTML Tags and Comments
|
Expression |
<!*[^<>]*> |
Description |
Matches any HTML tag with any parameters, as well as HTML comments. Very useful for stripping HTML from text. |
Matches |
<tr style="height: 1px; background-color: #ffffff"> <td colspan="4"> <!-- comment --> <!DOCTYPE html PUBLIC ... > |
Non-Matches |
Any text outside the tag symbols < > |
Author |
Rating:
Roberto Santana
|
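The following sketch, assuming Python's re module, tries this expression on a comment and on two edge cases. TAG_OR_COMMENT is a hypothetical name.

```python
import re

# The expression from this entry, copied unchanged.
TAG_OR_COMMENT = re.compile(r"<!*[^<>]*>")

# Tags and a simple comment are both removed.
clean = TAG_OR_COMMENT.sub("", '<!-- comment --><td colspan="4">x</td>')
print(clean)  # x

# Nothing requires a tag name after "<", so plain comparisons match too.
loose = TAG_OR_COMMENT.findall("12 is < 20 and > 6")
print(loose)  # ['< 20 and >']

# A comment that contains ">" is only matched up to that character.
partial = TAG_OR_COMMENT.sub("", "<!-- a > b -->")
print(repr(partial))  # ' b -->'
```

These edge cases suggest running the expression only on text known to be markup.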
Title |
Test
Details
Remove all attributes related to event handling from inside HTML tags
|
Expression |
(\s(\bon[a-zA-Z][a-z]+)\s?\=\s?[\'\"]?(javascript\:)?[\w\(\),\' ]*;?[\'\"]?)+ |
Description |
No idea whether anyone would ever need this, but I worked half a day on this pattern, so I decided to share it. :) It was never meant for production use; it was written to filter out all that annoying event-handling stuff while hunting a bug in my DHTML table-generating script. Give it a try with this string (see details):
<div id="TSelect_TD_value_911" class="TSel" onpaste="" onblur="TSelectClose(this);" onClick="TSelectOpen(this);" style="width:250px; padding:2px;"> |
Matches |
onPaste onBlur onClick ... ; onblur onclick onpaste ... |
Non-Matches |
<div id="TSelect_TD_value_911" class="TSel" style="width:250px; padding:2px;"> |
Author |
Rating:
globalplayer
|
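Trying the suggested string might look like the sketch below, which assumes Python's re module. EVENT_ATTRS and strip_event_handlers are hypothetical names.

```python
import re

# The expression from this entry, copied unchanged: one or more
# whitespace-led on...= attributes in a row.
EVENT_ATTRS = re.compile(
    r"(\s(\bon[a-zA-Z][a-z]+)\s?\=\s?[\'\"]?(javascript\:)?[\w\(\),\' ]*;?[\'\"]?)+"
)

def strip_event_handlers(tag: str) -> str:
    """Remove runs of event-handler attributes from a tag string."""
    return EVENT_ATTRS.sub("", tag)

tag = ('<div id="TSelect_TD_value_911" class="TSel" onpaste="" '
       'onblur="TSelectClose(this);" onClick="TSelectOpen(this);" '
       'style="width:250px; padding:2px;">')
print(strip_event_handlers(tag))
# <div id="TSelect_TD_value_911" class="TSel" style="width:250px; padding:2px;">
```

The three handlers are adjacent, so the outer "+" consumes them as one match. Handler bodies containing characters outside the bracketed set (for example "." or "=") would only be partly matched.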
Title |
Test
Details
Html Tag finder
|
Expression |
<\s*/?\s*\w+(\s*\w+\s*=\s*(['"][^'"]*['"]|[\w#]+))*\s*/?\s*> |
Description |
This pattern can find any HTML tag. It supports attributes whose values may be enclosed in single or double quotes. It also supports spaces between delimiters. |
Matches |
Color is <font color =#880000 >red < / font > |
Non-Matches |
12 is < 20 and > 6 |
Author |
Rating:
Not yet rated.
Ferdinando Ricchiuti
|
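The Matches and Non-Matches rows can be reproduced with a short sketch, assuming Python's re module; TAG_FINDER and find_tags are hypothetical names.

```python
import re

# The expression from this entry, copied unchanged. A triple-quoted raw
# string is used because the pattern contains both quote characters.
TAG_FINDER = re.compile(
    r"""<\s*/?\s*\w+(\s*\w+\s*=\s*(['"][^'"]*['"]|[\w#]+))*\s*/?\s*>"""
)

def find_tags(text: str) -> list:
    """Return the full text of each tag found."""
    return [m.group(0) for m in TAG_FINDER.finditer(text)]

print(find_tags("Color is <font color =#880000 >red < / font >"))
# ['<font color =#880000 >', '< / font >']
print(find_tags("12 is < 20 and > 6"))  # []
```

finditer with group(0) is used instead of findall, because the expression contains capturing groups and findall would return those groups rather than the whole tag.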
Title |
Test
Details
Extract quoted attributes from HTML tag
|
Expression |
(?<tagname>[^\s]*)="(?<tagvalue>[^"]*)" |
Description |
Quick and dirty extraction of quoted HTML attributes if you begin with just the tag string. Not intended for use in a full HTML document.
|
Matches |
<tag attr1="value1"> | <tag attr1="value1" attr2="value2"> |
Non-Matches |
<tag attr1="value1> | <tag attr1=value1"> | <tag attr1=value1> |
Author |
Rating:
Mitch Baker
|
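The expression uses the (?<name>...) named-group syntax. Python's re module spells named groups (?P<name>...), so the sketch below adapts the syntax and leaves everything else unchanged. ATTR and attributes are hypothetical names.

```python
import re

# Same expression with Python's named-group spelling.
ATTR = re.compile(r'(?P<tagname>[^\s]*)="(?P<tagvalue>[^"]*)"')

def attributes(tag: str) -> list:
    """Return (name, value) pairs for every double-quoted attribute."""
    return [(m.group("tagname"), m.group("tagvalue"))
            for m in ATTR.finditer(tag)]

print(attributes('<tag attr1="value1" attr2="value2">'))
# [('attr1', 'value1'), ('attr2', 'value2')]
print(attributes("<tag attr1=value1>"))  # []
```

Despite the group name, "tagname" captures the attribute name, not the element name.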
Title |
Test
Details
Capture HTML Tags
|
Expression |
<font[a-zA-Z0-9_\^\$\.\|\{\[\}\]\(\)\*\+\?\\~`!@#%&\-=;:'",/\n\s]*> |
Description |
This expression captures font tags (or any other HTML tag if you change the word font in the expression) together with their parameters, and stops at the tag's closing bracket. The only keyboard characters it is meant to exclude between the starting and ending bracket are additional angle brackets. For example, if you are looking for image tags it will not find <img src="..." alt=">My Picture<"> because of the nested brackets. When I allowed nested brackets like this, the expression did not always return only the tag I was looking for; sometimes it returned additional tags at the end. I suggest staying away from brackets in alt text and anywhere else they may be allowed. |
Matches |
<font color="#006666">, <font face="arial" style="font-size: 11pt"> |
Non-Matches |
<font, </font>, <font <> |
Author |
Rating:
Not yet rated.
Kurt McEllhenney
|
Title |
Test
Details
Strip HTML tags with exceptions
|
Expression |
<\/*?(?![^>]*?\b(?:a|img)\b)[^>]*?> |
Description |
This regex will match all HTML tags except 'a' tags or 'img' tags. You can edit the list of exclusions as you see fit. I use this regex to strip all HTML tags from source data except anchor tags and image tags. |
Matches |
<script> </html> <anytag> |
Non-Matches |
<a> <img /> </a> |
Author |
Rating:
Charles Forsyth
|
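A sketch of the stripping use described above, assuming Python's re module; STRIP is a hypothetical name.

```python
import re

# The expression from this entry, copied unchanged.
STRIP = re.compile(r"<\/*?(?![^>]*?\b(?:a|img)\b)[^>]*?>")

html = '<div><a href="x">link</a><img src="i.png" /></div>'
kept = STRIP.sub("", html)
print(kept)  # <a href="x">link</a><img src="i.png" />

# The lookahead searches the whole tag, not just its name, so a tag with
# the standalone word "a" or "img" anywhere inside is also left in place.
odd = STRIP.sub("", '<p title="a">x</p>')
print(odd)  # <p title="a">x
```

Anchoring the exclusion list to the start of the tag name would be one way to tighten this; that would be a change to the original expression.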
Title |
Test
Details
Strip HTML tags and content between
|
Expression |
<(script|style)[^>]*?>(?:.|\n)*?</\s*\1\s*> |
Description |
This regular expression will match only <script> and <style> tags and all content between them. Use this with regex.replace to strip script blocks and style blocks from HTML source. |
Matches |
<script>test</script>, <style>test</style> |
Non-Matches |
-all other html code is ignored- |
Author |
Rating:
Charles Forsyth
|
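The replace step mentioned in the description could look like this in Python (an assumption; the entry does not name a language). SCRIPT_STYLE is a hypothetical name.

```python
import re

# The expression from this entry, copied unchanged. "(?:.|\n)*?" lets the
# block span several lines; \1 makes the closing tag match the opening one.
SCRIPT_STYLE = re.compile(r"<(script|style)[^>]*?>(?:.|\n)*?</\s*\1\s*>")

page = "<p>a</p><script>\nvar x = 1;\n</script><style>p{}</style><p>b</p>"
stripped = SCRIPT_STYLE.sub("", page)
print(stripped)  # <p>a</p><p>b</p>
```

The expression is case-sensitive as written, so <SCRIPT> blocks would need a case-insensitive flag (re.IGNORECASE in Python).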
Title |
Test
Details
HTML Tag operation - Identification and Extraction
|
Expression |
(\<(.*?)\>)(.*?)(\<\/(.*?)\>) |
Description |
This identifies all the characters between an opening and a closing HTML tag, regardless of their length and of whether they are letters or digits. To extract the content between the tags, use the replacement string $3. |
Matches |
<td>city</td> <head>ok</head> |
Non-Matches |
content without tags |
Author |
Rating:
Mukundh
|
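The $3 replacement is written \3 (or \g<3>) in Python's re module. The sketch below assumes that module; CONTENT is a hypothetical name.

```python
import re

# The expression from this entry, copied unchanged. Group 3 is the text
# between the opening and closing tag.
CONTENT = re.compile(r"(\<(.*?)\>)(.*?)(\<\/(.*?)\>)")

inner = CONTENT.sub(r"\3", "<td>city</td> <head>ok</head>")
print(inner)  # city ok
```

Nothing ties the closing tag's name to the opening tag's, so mismatched pairs such as <td>city</head> are treated the same way.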
Title |
Test
Details
Valid HTML code
|
Expression |
^\<(\w){1,}\>(.){0,}([\</]|[\<])(\w){1,}\>$ |
Description |
Checks that a string begins with an HTML tag and ends with an opening or closing tag. |
Matches |
<br><b>fsd</b><br> |
Non-Matches |
<br><<<<>br> |
Author |
Rating:
Marko Maruna
|
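A quick check of the two sample strings, sketched with Python's re module (an assumption); VALID is a hypothetical name.

```python
import re

# The expression from this entry, copied unchanged.
VALID = re.compile(r"^\<(\w){1,}\>(.){0,}([\</]|[\<])(\w){1,}\>$")

print(bool(VALID.match("<br><b>fsd</b><br>")))  # True
print(bool(VALID.match("<br><<<<>br>")))        # False

# Only the two ends of the string are inspected, so this also passes.
print(bool(VALID.match("<a>junk/b>")))          # True
```

The expression is best read as a rough shape test; it does not check nesting or what lies between the first and last tag.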