RegExLib.com - The first Regular Expression Library on the Web!

Regular Expression Details

Title Pattern Title
Expression
^\s*((?:(?:\d+(?:\x20+\w+\.?)+(?:(?:\x20+STREET|ST|DRIVE|DR|AVENUE|AVE|ROAD|RD|LOOP|COURT|CT|CIRCLE|LANE|LN|BOULEVARD|BLVD)\.?)?)|(?:(?:P\.\x20?O\.|P\x20?O)\x20*Box\x20+\d+)|(?:General\x20+Delivery)|(?:C[\\\/]O\x20+(?:\w+\x20*)+))\,?\x20*(?:(?:(?:APT|BLDG|DEPT|FL|HNGR|LOT|PIER|RM|S(?:LIP|PC|T(?:E|OP))|TRLR|UNIT|\x23)\.?\x20*(?:[a-zA-Z0-9\-]+))|(?:BSMT|FRNT|LBBY|LOWR|OFC|PH|REAR|SIDE|UPPR))?)\,?\s+((?:(?:\d+(?:\x20+\w+\.?)+(?:(?:\x20+STREET|ST|DRIVE|DR|AVENUE|AVE|ROAD|RD|LOOP|COURT|CT|CIRCLE|LANE|LN|BOULEVARD|BLVD)\.?)?)|(?:(?:P\.\x20?O\.|P\x20?O)\x20*Box\x20+\d+)|(?:General\x20+Delivery)|(?:C[\\\/]O\x20+(?:\w+\x20*)+))\,?\x20*(?:(?:(?:APT|BLDG|DEPT|FL|HNGR|LOT|PIER|RM|S(?:LIP|PC|T(?:E|OP))|TRLR|UNIT|\x23)\.?\x20*(?:[a-zA-Z0-9\-]+))|(?:BSMT|FRNT|LBBY|LOWR|OFC|PH|REAR|SIDE|UPPR))?)?\,?\s+((?:[A-Za-z]+\x20*)+)\,\s+(A[LKSZRAP]|C[AOT]|D[EC]|F[LM]|G[AU]|HI|I[ADLN]|K[SY]|LA|M[ADEHINOPST]|N[CDEHJMVY]|O[HKR]|P[ARW]|RI|S[CD]|T[NX]|UT|V[AIT]|W[AIVY])\s+(\d+(?:-\d+)?)\s*$
Description
Based on a regular expression from Michael Ash, this expression captures US street and mailing addresses, single-line or multi-line (multi-line is more reliable), and breaks them into discrete parts: address line 1, address line 2, city, state, and postal code. The expression is not perfect: with the interpreter I am using, some addresses fail to match. It should, however, work for most addresses, particularly when lines are delimited with carriage returns, tabs, or some other whitespace delimiter that is not a space (\x20). Note: For improved compatibility, this expression does not use named groups. Output: \1 = Address 1, \2 = Address 2, \3 = City, \4 = State, \5 = Postal Code
Matches
P.O. Box 42 Huslia, AK 99746 | C/O John Paul, POBox 456, Motown, CA 96090
Non-Matches
4321 East 40th Apt #3 Anchorage AK 99504 | Stockton, CA 95215
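As a rough illustration of the group mapping in the description, here is a minimal Python sketch that runs the expression against the examples above. It assumes Python's `re` module, which may behave differently from the interpreter the author used. The variable names, the `parse_address` helper, and the field names are this sketch's own; the expression is assembled from pieces only to avoid repeating the address-line part twice, and the assembled string is meant to be the same as the one listed under Expression. The P.O. Box example is written with a `\r\n` line break, following the description's advice about non-space delimiters.

```python
import re

# Pieces of the expression listed above; the names are this sketch's own.
STREET = (r"(?:\d+(?:\x20+\w+\.?)+(?:(?:\x20+STREET|ST|DRIVE|DR|AVENUE|AVE|ROAD|RD"
          r"|LOOP|COURT|CT|CIRCLE|LANE|LN|BOULEVARD|BLVD)\.?)?)")
PO_BOX = r"(?:(?:P\.\x20?O\.|P\x20?O)\x20*Box\x20+\d+)"
GENERAL_DELIVERY = r"(?:General\x20+Delivery)"
CARE_OF = r"(?:C[\\\/]O\x20+(?:\w+\x20*)+)"
UNIT = (r"(?:(?:(?:APT|BLDG|DEPT|FL|HNGR|LOT|PIER|RM|S(?:LIP|PC|T(?:E|OP))|TRLR|UNIT|\x23)"
        r"\.?\x20*(?:[a-zA-Z0-9\-]+))|(?:BSMT|FRNT|LBBY|LOWR|OFC|PH|REAR|SIDE|UPPR))?")
# One address line: a street, P.O. box, General Delivery, or C/O line,
# followed by an optional secondary unit (apartment, suite, ...).
LINE = "(?:" + "|".join([STREET, PO_BOX, GENERAL_DELIVERY, CARE_OF]) + r")\,?\x20*" + UNIT
STATE = (r"A[LKSZRAP]|C[AOT]|D[EC]|F[LM]|G[AU]|HI|I[ADLN]|K[SY]|LA|M[ADEHINOPST]"
         r"|N[CDEHJMVY]|O[HKR]|P[ARW]|RI|S[CD]|T[NX]|UT|V[AIT]|W[AIVY]")

ADDRESS = re.compile(
    r"^\s*(" + LINE + r")\,?\s+(" + LINE + r")?\,?\s+((?:[A-Za-z]+\x20*)+)\,\s+("
    + STATE + r")\s+(\d+(?:-\d+)?)\s*$"
)

# Hypothetical field names for groups \1 to \5.
FIELDS = ("address1", "address2", "city", "state", "postal_code")


def parse_address(text):
    """Return a dict of the five captured parts, or None if there is no match."""
    match = ADDRESS.match(text)
    if match is None:
        return None
    # The address groups can keep a trailing comma; trim it for readability.
    return {name: (value.rstrip(", ") if value else value)
            for name, value in zip(FIELDS, match.groups())}


print(parse_address("C/O John Paul, POBox 456, Motown, CA 96090"))
print(parse_address("P.O. Box 42\r\nHuslia, AK 99746"))
print(parse_address("Stockton, CA 95215"))  # None: no address line
```

In this sketch, the P.O. Box address written with single spaces ("P.O. Box 42 Huslia, AK 99746") does not match, because the expression requires whitespace both before and after the optional second address line, so a single space between address line 1 and the city is not enough. That is consistent with the description's remark that multi-line input is more reliable.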
Author Ross Hammer. Rating: Not yet rated.
Source Ross Hammer, based on work by Michael Ash

Existing User Comments

Title: Problems recognising Street Suffixes (Re PO Box only?)
Name: Mark
Date: 2/5/2010 8:24:51 AM
Comment:
I guess the list of street suffixes is set to catch the most common forms (those being the suffixes the author could recall?). The US Postal Service publishes an astonishingly long list of street suffixes: http://www.usps.com/ncsc/lookups/usps_abbreviations.html#suffix BTW, the only "Pike" I have heard of was when colleagues in Waltham, MA referred to the old turnpike out of Boston as the "Mass. 'Pike".
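One way to act on this point, sketched here in Python, is to generate the suffix alternation from a list rather than typing it into the expression by hand. Everything in the sketch is an assumption for illustration: `SAMPLE_SUFFIXES` is a small hand-picked sample and not the USPS list, and `suffix_pattern` and `street_line_regex` are hypothetical helpers that are not part of the expression above.

```python
import re

# Illustrative sample only; NOT the full USPS suffix list.
SAMPLE_SUFFIXES = ["STREET", "ST", "AVENUE", "AVE", "PIKE",
                   "PARKWAY", "PKWY", "TRAIL", "TRL"]


def suffix_pattern(suffixes):
    """Return a regex fragment: spaces, one listed suffix, optional period."""
    # Longest first, so "STREET" is tried before its prefix "ST".
    ordered = sorted(set(suffixes), key=lambda s: (-len(s), s))
    return r"\x20+(?:" + "|".join(re.escape(s) for s in ordered) + r")\.?"


def street_line_regex(suffixes):
    """Compile a pattern for: house number, name words, then a listed suffix."""
    return re.compile(
        r"^\d+(?:\x20+\w+\.?)*?" + suffix_pattern(suffixes) + r"$",
        re.IGNORECASE,
    )


street_line = street_line_regex(SAMPLE_SUFFIXES)
print(bool(street_line.match("1234 Elm Hill Pike")))  # True: PIKE is in the sample
print(bool(street_line.match("1234 Elm Hill")))       # False: no listed suffix
```

Wrapping the alternatives in a single group makes the leading `\x20+` apply to every suffix. In the expression above, the alternation is written as `\x20+STREET|ST|DRIVE|...`, so the leading space belongs only to the `STREET` branch.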


Title: PO box only?
Name: Cory
Date: 10/5/2009 2:52:28 PM
Comment:
This regex missed A LOT of valid addresses. I think matching the street type is a bad idea. You are never going to be able to account for all the variations, because not every street is a St. or Ave. I had a street that did not match, and it seems to be because it was "1234 Elm Hill Pike". I've never heard of a Pike, but no doubt there are countless other variations that this expression can never account for. I think the best way to handle this would be to match \d+\s+([\w]+[\s\b])+ instead of hard-coding the street types. That would match nearly anything, and would probably go too far the other way and produce a lot of false positives.
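To make the trade-off concrete, here is a small Python sketch of the looser pattern suggested in this comment. It assumes Python's `re` module, and `loose_match` is a hypothetical helper name. One caveat specific to Python: inside a character class, `\b` means the backspace character rather than a word boundary, so `[\s\b]` effectively requires whitespace after every word, including the last one.

```python
import re

# The pattern from the comment, unchanged.
LOOSE = re.compile(r"\d+\s+([\w]+[\s\b])+")


def loose_match(text):
    """Return the text matched by the loose pattern, or None."""
    match = LOOSE.search(text)
    return match.group(0) if match else None


# With trailing whitespace, the whole street line is matched.
print(repr(loose_match("1234 Elm Hill Pike\n")))           # '1234 Elm Hill Pike\n'
# Without it, the last word is dropped from the match.
print(repr(loose_match("1234 Elm Hill Pike")))             # '1234 Elm Hill '
# Any number followed by words also matches: a false positive.
print(repr(loose_match("Order 66 was shipped today ")))    # '66 was shipped today '
```

The third call shows the false positives the comment anticipates: the pattern accepts any number followed by whitespace-terminated words, whether or not they form an address.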


Copyright © 2001-2024, RegexAdvice.com | ASP.NET Tutorials