RegExLib.com - The first Regular Expression Library on the Web!

Regular Expression Details

Title: Pattern Title
Expression
(http|ftp|https):\/\/[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])?
Description
*CORRECTED: Thanks again for all the comments below. If you want to include internal domains as well, change the partial code (\.[\w-_]+)+ to (\.[\w-_]+)? (see the comments below).* This is the regular expression I use to add links in my email program. It also ignores commas, periods, and colons that are meant as sentence punctuation at the end of a URL, as in the sentence "check out http://www.yahoo.com/." (the final period is ignored). Note that it requires some modification to match URLs that don't start with http.
Matches
http://regxlib.com/Default.aspx | http://electronics.cnet.com/electronics/0-6342366-8-8994967-1.html
Non-Matches
www.yahoo.com
Author: M H
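
As a rough illustration of the behaviour described above, the following Python sketch runs the expression against the listed examples. The choice of Python's `re` module and the names `URL_RE` and `find_urls` are assumptions made for this example; the listing itself does not name a regex flavour, and other engines may differ in details.

```python
import re

# The expression exactly as listed above, split over two lines for readability.
URL_RE = re.compile(
    r"(http|ftp|https):\/\/[\w\-_]+(\.[\w\-_]+)+"
    r"([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])?"
)


def find_urls(text):
    """Return every substring of `text` that the expression matches."""
    return [m.group(0) for m in URL_RE.finditer(text)]


# The trailing period is left out because the last character class
# does not contain '.', ',' or ':'.
print(find_urls("check out http://www.yahoo.com/."))  # ['http://www.yahoo.com/']

# Without a leading protocol nothing is found.
print(find_urls("www.yahoo.com"))  # []
```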

Existing User Comments

Title: Problems with "Umlaute"
Name: LEXO
Date: 12/3/2013 3:29:23 AM
Comment:
I see there's a problem when the URL contains Umlaute like ä, ö or ü. In the past this was not an issue, but with a modern CMS like WordPress and UTF-8 charsets, users can upload files whose names contain Umlaute. Thus I slightly modified the regex to: (http|ftp|https):\/\/[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:\/~\+#äöüÄÖÜ]*[\w\-\@?^=%&\/~\+#])?
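
Whether the extra characters are needed depends on how the engine defines \w. The sketch below is an assumption-laden illustration in Python: `re.ASCII` stands in for an engine whose \w covers only ASCII, the URL is hypothetical, and `matched` is a helper invented for this example.

```python
import re

ORIGINAL = (r"(http|ftp|https):\/\/[\w\-_]+(\.[\w\-_]+)+"
            r"([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])?")
MODIFIED = (r"(http|ftp|https):\/\/[\w\-_]+(\.[\w\-_]+)+"
            r"([\w\-\.,@?^=%&:\/~\+#äöüÄÖÜ]*[\w\-\@?^=%&\/~\+#])?")

url = "http://example.com/uploads/Müller.jpg"  # hypothetical upload URL


def matched(pattern, text, flags=0):
    """Return the matched substring, or None if the pattern finds nothing."""
    m = re.search(pattern, text, flags)
    return m.group(0) if m else None


# ASCII-only \w: the original stops right before the umlaut.
print(matched(ORIGINAL, url, re.ASCII))  # http://example.com/uploads/M
# ASCII-only \w with the modified class: the whole URL is picked up.
print(matched(MODIFIED, url, re.ASCII))  # http://example.com/uploads/Müller.jpg
# Python 3 str patterns use a Unicode-aware \w by default, so the
# original already matches the whole URL there.
print(matched(ORIGINAL, url))            # http://example.com/uploads/Müller.jpg
```

Note that the modification only extends the first path class, so a URL whose very last character is an umlaut would still lose that character in an ASCII-only engine.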


Title: matches http://www.textlink
Name: czechmate1976
Date: 8/2/2013 10:55:47 AM
Comment:
The pattern matches a domain name without a top-level domain (e.g. .com).


Title: Why &
Name: BurninLeo
Date: 12/27/2012 6:29:18 AM
Comment:
Thanks for improving the expression even after all these years! I do not fully understand the &amp; in the expression. To the best of my knowledge, regex does not work with HTML entities, and [] only works on a single-character basis, so a, m, and p should already be covered by \w, shouldn't they? @Abdel Hady: When your pattern delimiter is a slash (/), please put a backslash before each slash in the pattern. And when using PHP, of course, every backslash needs another backslash to escape it for PHP.


Title: not working with php's preg_replace_callback
Name: Abdel Hady
Date: 4/28/2012 7:44:22 PM
Comment:
it gives me Fatal error: Uncaught exception 'Mysite_Exception_Warning' with message 'preg_replace_callback(): Unknown modifier '~''
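
One plausible reading of this error, assuming the pattern was wrapped in slash delimiters as BurninLeo's reply suggests, is that the first unescaped slash ends the pattern and the character after it is read as a modifier. The Python sketch below only inspects the pattern text; `first_unescaped` and `escape_delimiter` are hypothetical helpers written for this illustration, not PHP or library functions.

```python
# The expression as originally listed (slashes inside the classes are unescaped).
ORIGINAL = (r"(http|ftp|https):\/\/[\w\-_]+(\.[\w\-_]+)+"
            r"([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])?")


def first_unescaped(pattern, delim="/"):
    """Index of the first delimiter not escaped by a backslash, or -1."""
    i = 0
    while i < len(pattern):
        if pattern[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
        elif pattern[i] == delim:
            return i
        else:
            i += 1
    return -1


def escape_delimiter(pattern, delim="/"):
    """Put a backslash before every delimiter that is not already escaped."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern[i] == "\\":
            out.append(pattern[i:i + 2])  # keep existing escapes untouched
            i += 2
        elif pattern[i] == delim:
            out.append("\\" + delim)
            i += 1
        else:
            out.append(pattern[i])
            i += 1
    return "".join(out)


idx = first_unescaped(ORIGINAL)
print(ORIGINAL[idx + 1])           # ~  (the character reported as a modifier)
print(escape_delimiter(ORIGINAL))  # every slash is now written as \/
```

The escaped result is the same string that Gk posts in the "Didn't run in Javascript" comment.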


Title: Say good
Name: Hunglt (VietNam)
Date: 2/5/2012 8:27:21 PM
Comment:
Thank you


Title: Didn't run in Javascript
Name: Gk
Date: 9/21/2010 11:23:37 AM
Comment:
IMHO it should be: (http|ftp|https):\/\/[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:\/~\+#]*[\w\-\@?^=%&\/~\+#])?


Title: Wrong classes
Name: OD
Date: 5/4/2010 10:22:48 AM
Comment:
Also, the allowed characters, as defined by RFC 2141, are exactly the class [0-9a-zA-Z()+,.:=@;$_!*'%/?#-]


Title: Poor syntax
Name: OD
Date: 5/4/2010 10:15:40 AM
Comment:
Because of unnecessary escaping in the character classes, '\' is being included and it's not clear whether it should be. [\w\-_] is better written as [\w-] [\w\-\.,@?^=%&:/~\+#] is better written as [\w.,@?^=%&:/~+#-] etc.
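
Whether the backslash ends up in the class depends on the regex flavour. As a quick check for one flavour, the Python sketch below compares the verbose and compact spellings over the printable ASCII characters; `accepted` and `PAIRS` are names invented for this example, and the result says nothing about engines that treat a backslash inside brackets literally.

```python
import re
import string

# (verbose spelling from the expression, compact spelling suggested above)
PAIRS = [
    (r"[\w\-_]", r"[\w-]"),
    (r"[\w\-\.,@?^=%&:/~\+#]", r"[\w.,@?^=%&:/~+#-]"),
]


def accepted(char_class):
    """Set of printable ASCII characters matched by a one-character class."""
    rx = re.compile(char_class)
    return {c for c in string.printable if rx.fullmatch(c)}


for verbose, compact in PAIRS:
    # Both spellings accept exactly the same characters in Python's re.
    print(verbose, compact, accepted(verbose) == accepted(compact))

# In Python's re the escapes are consumed, so '\' itself is not accepted.
print("\\" in accepted(PAIRS[0][0]))  # False
```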


Title: Improvements?
Name: Brad
Date: 9/11/2009 10:39:08 AM
Comment:
Great expression; check these out: http://regexlib.com/REDetails.aspx?regexp_id=2767 http://regexlib.com/REDetails.aspx?regexp_id=2766


Title: this matches http://www.yahoo when it shouldnt
Name: Andy
Date: 3/28/2009 3:35:38 PM
Comment:
I just tried the following PHP and it passes when it should fail. Any tips?
function testURL() {
    $urltocheck = 'http://www.yahoo';
    // Returns 1 ("Pass") when the pattern matches at the start of the string.
    if (preg_match("/^((http|ftp|https):\/\/|www\.)[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:\/~\+#]*[\w\-\@?^=%&\/~\+#])?/", $urltocheck)) {
        echo "Pass";
        return 1;
    }
    print "invalid";
    return 0;
}
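
A likely explanation is that the expression only requires one dotted label after the first host label and has no notion of valid top-level domains, so "www" plus ".yahoo" is enough. The Python sketch below breaks the match into its groups; it is an illustration in Python rather than PHP, and `ANDY_RE` and `explain` are names made up for this example.

```python
import re

# Same pattern as in the PHP snippet above, without the PHP delimiters.
ANDY_RE = re.compile(
    r"^((http|ftp|https):\/\/|www\.)[\w\-_]+(\.[\w\-_]+)+"
    r"([\w\-\.,@?^=%&:\/~\+#]*[\w\-\@?^=%&\/~\+#])?"
)


def explain(url):
    """Return the interesting groups of a match, or None if nothing matches."""
    m = ANDY_RE.match(url)
    if m is None:
        return None
    return {
        "prefix": m.group(1),             # scheme plus '://', or 'www.'
        "last_dotted_label": m.group(3),  # last repetition of (\.[\w\-_]+)
        "rest": m.group(4),               # path and query, if any
    }


print(explain("http://www.yahoo"))
# {'prefix': 'http://', 'last_dotted_label': '.yahoo', 'rest': None}
print(explain("http://www"))   # None: no dotted label at all
print(explain("http://www."))  # None: the dot is not followed by a label
```

Rejecting http://www.yahoo would need an additional check on the final label, for example against a list of known top-level domains, which this expression does not attempt.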


Title: Displayed wrong
Name: Regex Newbie
Date: 8/3/2008 3:24:20 PM
Comment:
It works well for me in Visual Basic 6; however, it is displayed wrongly here: it has been HTML-encoded with &amp; instead of just &. Now all I need is a regex that can get relative links out of an <a> tag for my spider.
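
If a pattern has been copied from a page that shows &amp; in place of &, decoding the entities before use restores the intended class. The sketch below uses Python's standard `html.unescape`; the `encoded` string is an assumed example of how the last character class might have looked when HTML-encoded.

```python
import html
import re

# Assumed HTML-encoded form of the final character class.
encoded = r"[\w\-\@?^=%&amp;/~\+#]"
decoded = html.unescape(encoded)
print(decoded)  # [\w\-\@?^=%&/~\+#]

# Used undecoded, the class treats '&', 'a', 'm', 'p' and ';' as single
# characters. The letters are already covered by \w, so the only visible
# difference is that ';' is accepted as well.
print(bool(re.fullmatch(encoded, ";")))  # True
print(bool(re.fullmatch(decoded, ";")))  # False
```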


Title: little mod
Name: Pawel Gruszecki
Date: 3/11/2008 7:37:23 PM
Comment:
^((http|ftp|https):\/\/|www\.)[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:\/~\+#]*[\w\-\@?^=%&\/~\+#])? I've made a small correction to the RegExp posted by John Brooking on 11/7/2007 2:51:56 AM, because the PHP server returned an error in the expression. This works perfectly. Matches: http://regxlib.com/Default.aspx http://electronics.cnet.com/electronics/0-6342366-8-8994967-1.html www.yahoo.com yahoo.com Non-Matches: http//regxlib.com/Default.aspx hppt://google.pl
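
The listed examples can be checked mechanically. The Python sketch below does so with `re.match`; `PAWEL_RE` and `results` are names chosen for this example, and the engine is Python rather than PHP. One caveat it brings out: because the pattern requires either a protocol or a leading "www.", the bare "yahoo.com" is not matched here, contrary to the list above.

```python
import re

PAWEL_RE = re.compile(
    r"^((http|ftp|https):\/\/|www\.)[\w\-_]+(\.[\w\-_]+)+"
    r"([\w\-\.,@?^=%&:\/~\+#]*[\w\-\@?^=%&\/~\+#])?"
)

samples = [
    "http://regxlib.com/Default.aspx",
    "http://electronics.cnet.com/electronics/0-6342366-8-8994967-1.html",
    "www.yahoo.com",
    "yahoo.com",
    "http//regxlib.com/Default.aspx",
    "hppt://google.pl",
]

# True when the pattern matches at the start of the sample.
results = {s: PAWEL_RE.match(s) is not None for s in samples}
for sample, ok in results.items():
    print(f"{sample}: {'match' if ok else 'no match'}")
```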


Title: @ character
Name: Bob Hurt
Date: 1/4/2008 2:14:37 PM
Comment:
The '@' character doesn't have a special regular expression meaning, does it? If it does, what is the meaning? If it does not, why is the second '@' escaped with a backslash?


Title: Pattern Title - M H
Name: Candida
Date: 11/7/2007 2:51:56 AM
Comment:
Hi, this was really helpful....thanks for the post. :)


Title: To match www.yahoo.com
Name: John Brooking
Date: 10/12/2005 5:25:40 PM
Comment:
First, I've got to say that comments like "This expression does not work" are not helpful. I was able to get it to properly pick up the URL out of the string "The URL is www.my-domain.com?id=5&b=.&c=5.", and it even excluded the final period but got the others. What input did you get it to fail on? Maybe it can be fixed. Be a little helpful! Anyway, I'm mainly posting to say that the following variation ((http|ftp|https):\/\/|www\.)[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])? seems to pick up the ones without the leading protocol string, as long as they start with "www.". Sure, they don't all start with www., but you gotta recognize it somehow. Maybe you could do more by looking for the final top-level string (com, org, us, uk, ...) and working backwards from there. Anyhow, this works for my purposes, so I thought I'd share it. It *does* match "www.yahoo.com".
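
To make the www. variation concrete, here is a minimal Python sketch that uses it to turn URLs in plain text into links, roughly the use case named in the description. The function `linkify`, the choice to prepend "http://" to www. matches, and the HTML output format are all assumptions of this example, not something the comment specifies.

```python
import html
import re

JOHN_RE = re.compile(
    r"((http|ftp|https):\/\/|www\.)[\w\-_]+(\.[\w\-_]+)+"
    r"([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])?"
)


def linkify(text):
    """Wrap every matched URL in an <a> tag; text around it is left as-is."""
    def repl(m):
        url = m.group(0)
        # group(2) is the protocol name; it is None for 'www.' matches,
        # in which case this sketch assumes plain http.
        href = url if m.group(2) else "http://" + url
        return '<a href="{}">{}</a>'.format(html.escape(href), html.escape(url))
    return JOHN_RE.sub(repl, text)


text = "The URL is www.my-domain.com?id=5&b=.&c=5."
print(JOHN_RE.search(text).group(0))  # www.my-domain.com?id=5&b=.&c=5
print(linkify(text))
```

The final period stays outside the link while the inner periods are kept, which matches the behaviour described in the comment.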


Title: Mr.
Name: Jon
Date: 9/17/2005 3:28:50 AM
Comment:
This expression does not work


Title: Span Multiple lines but not match whitespace
Name: MDR
Date: 5/9/2005 11:47:34 AM
Comment:
How could I make this span multiple lines but not match whitespace? For example http://www.msn.com/pl acestogo/default.aspx


Title: bad
Name: guili
Date: 2/1/2005 4:01:08 AM
Comment:
* Allows spaces in URLs
* Allows more than one ? in URL


Title: expression does work
Name: akula
Date: 9/24/2004 6:37:35 AM
Comment:
Hi, the following http link does not match the expression: http://electronics.cnet.com/electronics/0-6342366-8-8994967-1.html


Title: Fix for domain-name.net
Name: Venata
Date: 9/16/2004 5:15:07 AM
Comment:
You need to replace [\w]+ with [\w\-]+
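
The effect of this substitution can be seen on a single host label. The sketch below is a small Python check; `label_ok` is a helper invented here, and only the two character classes come from the comment.

```python
import re


def label_ok(label_class, label):
    """True if the whole label is matched by one or more class characters."""
    return re.fullmatch(label_class, label) is not None


# \w alone has no hyphen, so a hyphenated host label is rejected.
print(label_ok(r"[\w]+", "some-domain"))    # False
# Adding the hyphen to the class accepts it.
print(label_ok(r"[\w\-]+", "some-domain"))  # True
```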


Title: Not match
Name: Venata
Date: 9/16/2004 5:11:33 AM
Comment:
It does not match http://some-domain.net/


Title: good for extracting urls
Name: maximilla
Date: 7/12/2004 4:16:53 PM
Comment:
gorgeous, thank you! works perfect for my screenscraper.


Title: ops
Name: M H
Date: 4/28/2004 4:09:21 PM
Comment:
Actually u should replace this partial code (\.[\w]+)+ with (\.[\w]+)?
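
The trade-off behind this replacement can be sketched in Python. The example applies the same change to the current form of the expression, as the description does; `STRICT`, `INTERNAL`, and `matched` are names assumed for this illustration.

```python
import re

# Current expression: at least one dotted label is required after the host.
STRICT = (r"(http|ftp|https):\/\/[\w\-_]+(\.[\w\-_]+)+"
          r"([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])?")
# Variant with '?' in place of '+': the dotted label becomes optional.
INTERNAL = (r"(http|ftp|https):\/\/[\w\-_]+(\.[\w\-_]+)?"
            r"([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])?")


def matched(pattern, text):
    """Return the matched substring, or None if the pattern finds nothing."""
    m = re.search(pattern, text)
    return m.group(0) if m else None


internal_url = "http://localhost/blahblahblah.html"
print(matched(STRICT, internal_url))    # None
print(matched(INTERNAL, internal_url))  # http://localhost/blahblahblah.html

# The price: one-part hosts such as http://www are accepted again.
print(matched(STRICT, "http://www"))    # None
print(matched(INTERNAL, "http://www"))  # http://www

# Multi-part hosts still work, because '.' is also allowed in the path class.
print(matched(INTERNAL, "check out http://www.yahoo.com/."))  # http://www.yahoo.com/
```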


Title: Depends on what u consider is "wrong"
Name: M H
Date: 4/28/2004 4:02:54 PM
Comment:
You could have an internal URL like http://localhost/blahblahblah.html. This is not code to validate and verify the URL (e.g. you could put http://123.123/ and it would still work; whether it's a "right" URL is beyond what this code is supposed to do). Anyway, I've changed the code to "exclude" one-part domains for general use, but if you want to include internal URLs as well, just remove the "{1,}" from the code.


Title: Not great
Name: Stephen
Date: 4/28/2004 3:30:56 PM
Comment:
Critic is right; doesn't require periods in the domain name. ie, 'http://www' passed


Title: Worked out swell
Name: TjoekBezoer
Date: 3/30/2004 8:40:33 PM
Comment:
Worked out perfectly for me. Used it for checking a 404 parameter in IIS, and it did exactly what it needed to do.


Title: very bad don't use it
Name: critic
Date: 12/1/2003 6:49:39 AM
Comment:
it allows http://www. for example...


Copyright © 2001-2024, RegexAdvice.com