RegExLib.com - The first Regular Expression Library on the Web!

Regular Expression Details

Title Pattern Title
Expression
&\#x0*(0|1|2|3|4|5|6|7|8|B|C|E|F|10|11|12|13|14|15|16|17|18|19|1A|1B|1C|1D|1E|1F);
Description
Can be used to match (and strip out) hexadecimal character references for low-order non-printable ASCII characters (ASCII 0-31) in string data before it is added to an XML document. Useful with parsers like Microsoft's MSXML3 that strictly enforce the W3C specification on allowable characters. Does not match ASCII 9 (horizontal tab), 10 (line feed), or 13 (carriage return), which XML allows.
Matches
 | 
Non-Matches
  | �
Author: Matt Skone
Rating: Not yet rated.
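
As a rough illustration of the description above, here is a minimal Python sketch that applies the expression to strip the matching character references. The function name `strip_illegal_char_refs` and the sample string are hypothetical, and the `\#` escape from the original is written as a plain `#`. Note that, as written, the expression only matches uppercase hex digits and hexadecimal (not decimal) references.

```python
import re

# The expression from this entry, split across two string literals for width.
# "\#" in the original is written here as a plain "#".
ILLEGAL_CHAR_REF = re.compile(
    r"&#x0*(0|1|2|3|4|5|6|7|8|B|C|E|F|10|11|12|13|14|15|16|17|18|19"
    r"|1A|1B|1C|1D|1E|1F);"
)


def strip_illegal_char_refs(text: str) -> str:
    """Remove hex character references for ASCII 0-31, except 9, 10 and 13."""
    return ILLEGAL_CHAR_REF.sub("", text)


# Hypothetical sample: references to 0x1 and 0x1F are removed,
# while tab (0x9), line feed (0xA) and carriage return (0xD) are kept.
sample = "a&#x1;b&#x0001F;c&#x9;d&#xA;e&#xD;f"
print(strip_illegal_char_refs(sample))  # abc&#x9;d&#xA;e&#xD;f
```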

Existing User Comments

Title: Alternative?
Name: J'son
Date: 9/29/2005 2:56:48 PM
Comment:
I'm using .NET, and in order to get my XML text into the x0000 format you describe I had to encode the XML string using XmlConvert.EncodeName(rawXMLData) - this converts undesirables in an XML string like "jason smith" into "jason_x0020_smith" - and then I'm able to use your pattern (with a couple of mods), remove the offending characters, and then decode back to normal. I've found that I can skip the encode/decode process altogether by using the \u hex escape, so I've modified your pattern as follows:
\u0000|\u0001|\u0002|\u0003|\u0004|\u0005|\u0006|\u0007|\u0008|\u000B|\u000C|\u000E|\u000F|\u0010|\u0011|\u0012|\u0013|\u0014|\u0015|\u0016|\u0017|\u0018|\u0019|\u001A|\u001B|\u001C|\u001D|\u001E|\u001F
Works like a charm so far, but let me know if you see any holes - I'm new to hex, Unicode, etc. :P Thanx! -- J'son
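
The alternation in the comment above lists each raw control character separately; the same set can probably be written more compactly as a character class. A minimal sketch in Python (not the .NET the commenter uses), where the name `strip_control_chars` and the sample string are hypothetical:

```python
import re

# Raw control characters U+0000-U+001F, leaving out tab (U+0009),
# line feed (U+000A) and carriage return (U+000D), which XML allows.
CONTROL_CHARS = re.compile(r"[\x00-\x08\x0B\x0C\x0E-\x1F]")


def strip_control_chars(text: str) -> str:
    """Remove raw control characters that XML 1.0 does not allow."""
    return CONTROL_CHARS.sub("", text)


# Hypothetical sample: NUL and U+001F are removed; tab, LF and CR survive.
sample = "jason\x00 smith\x1f\t\n\r"
print(repr(strip_control_chars(sample)))  # 'jason smith\t\n\r'
```

This works on the decoded string directly, so no encode/decode round trip is needed, which matches the shortcut the comment describes.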


Copyright © 2001-2025, RegexAdvice.com