RegExLib.com - The first Regular Expression Library on the Web!

Regular Expression Details

Title
Pattern Title
Expression
(?s)( class=\w+(?=([^<]*>)))|(<!--\[if.*?<!\[endif\]-->)|(<!\[if !\w+\]>)|(<!\[endif\]>)|(<o:p>[^<]*</o:p>)|(<span[^>]*>)|(</span>)|(font-family:[^>]*[;'])|(font-size:[^>]*[;'])(?-s)
Description
Word HTML cleanup code. Use this expression to strip most of the markup that Word adds to an HTML document: the many span elements, font-family and font-size style declarations, class attributes, and the conditional blocks (the [if ...] ... [endif] markers). Replace every match with an empty string, for example Regex.Replace(originalHtml, regExpr, "").
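The replace call described above can be sketched in Python as follows. This is a port under stated assumptions, not the author's code: the description's call is .NET-style, Python's `re` module does not accept a bare `(?-s)` toggle, so the inline `(?s)` and `(?-s)` are dropped and `re.DOTALL` is passed instead. The names `WORD_CRUFT`, `clean_word_html`, and the sample markup are made up for illustration.

```python
import re

# The library pattern with the inline (?s) ... (?-s) toggles removed;
# re.DOTALL plays the same role here (assumption: Python port, not .NET).
WORD_CRUFT = re.compile(
    r"( class=\w+(?=([^<]*>)))"      # class attribute inside a tag
    r"|(<!--\[if.*?<!\[endif\]-->)"  # conditional comment blocks
    r"|(<!\[if !\w+\]>)"             # <![if !...]> openers
    r"|(<!\[endif\]>)"               # <![endif]> closers
    r"|(<o:p>[^<]*</o:p>)"           # Office paragraph markers
    r"|(<span[^>]*>)"                # opening span tags
    r"|(</span>)"                    # closing span tags
    r"|(font-family:[^>]*[;'])"      # font-family declarations
    r"|(font-size:[^>]*[;'])",       # font-size declarations
    re.DOTALL,
)


def clean_word_html(original_html: str) -> str:
    """Remove every match of the pattern, i.e. replace it with ""."""
    return WORD_CRUFT.sub("", original_html)


# Hypothetical Word-style snippet, invented for this example.
sample = (
    "<p class=MsoNormal><span lang=EN-US>Hello<o:p></o:p></span></p>"
    "<!--[if gte mso 9]><xml>junk</xml><![endif]-->"
    "<table><tr><td>kept</td></tr></table>"
)

cleaned = clean_word_html(sample)
print(cleaned)  # <p>Hello</p><table><tr><td>kept</td></tr></table>
```

The span and class markup disappears while the table is left alone, which agrees with the Matches and Non-Matches entries listed for this expression.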
Matches
<span>
Non-Matches
<table>
Author
Peter Donker

Existing User Comments

Title: Nice Pattern
Name: Justin West
Date: 4/20/2005 1:59:26 PM
Comment:
You just saved me hours.


Title: Url field doesn't work?
Name: MikeG
Date: 1/2/2005 8:25:15 PM
Comment:
I guess the Url field doesn't work. Here's the Url for the app... http://fresh.no-ip.org/include/downloads.html


Title: GoodOnYa
Name: MikeG
Date: 1/2/2005 8:22:36 PM
Comment:
Kudos and thanks for the work. I've incorporated part of your code into a C# app I wrote to parse out Word copy & paste snippets for repasting in HTML form posts (like blogs, forums, etc.). Source code and exe are downloadable from the link above. Thanks again and best regards!


Title: Nice pattern
Name: Darren Neimke
Date: 5/31/2004 7:53:02 AM
Comment:
Nice one! For your interest you can also save patterns with embedded whitespace which can help to make them more readable (maybe). For example:

(?s)
( class=\w+(?=([^<]*>)))
|(<!--\[if.*?<!\[endif\]-->)
|(<!\[if !\w+\]>)
|(<!\[endif\]>)
|(<o:p>[^<]*</o:p>)
|(<span[^>]*>)
|(</span>)
|(font-family:[^>]*[;'])
|(font-size:[^>]*[;'])
(?-s)
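A rough sketch of the embedded-whitespace idea in Python, using `re.VERBOSE` (the counterpart of the .NET IgnorePatternWhitespace option). One caveat applies in any engine's whitespace-ignoring mode: the pattern contains two significant spaces, the one before `class=` and the one after `if`, and they would be silently dropped unless protected, so they are written as `[ ]` here. The name `WORD_CRUFT_VERBOSE` and the test string are hypothetical, and `re.DOTALL` again stands in for the inline `(?s)` toggle.

```python
import re

# Same alternatives as the original expression, one per line.
# In verbose mode unescaped whitespace is ignored and "#" starts a comment,
# so each literal space is written as [ ] to keep it significant.
WORD_CRUFT_VERBOSE = re.compile(
    r"""
      ([ ]class=\w+(?=([^<]*>)))     # class attribute inside a tag
    | (<!--\[if.*?<!\[endif\]-->)    # conditional comment blocks
    | (<!\[if[ ]!\w+\]>)             # openers such as <![if !supportLists]>
    | (<!\[endif\]>)                 # matching closers
    | (<o:p>[^<]*</o:p>)             # Office paragraph markers
    | (<span[^>]*>)                  # opening span tags
    | (</span>)                      # closing span tags
    | (font-family:[^>]*[;'])        # font-family declarations
    | (font-size:[^>]*[;'])          # font-size declarations
    """,
    re.DOTALL | re.VERBOSE,
)

# Hypothetical input, invented for this example.
snippet = "<p class=MsoNormal><![if !supportLists]>x<![endif]></p>"
result = WORD_CRUFT_VERBOSE.sub("", snippet)
print(result)  # <p>x</p>
```

Without the `[ ]` guards, the first alternative would match `class=` with no leading space and the `<![if !...]>` alternative would stop matching at all, so the spaced-out layout is only equivalent to the one-line pattern when those two spaces are protected.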

