Title: Great regex, but not foolproof
Name: viren negi
Date: 9/23/2009 2:16:38 AM
Comment:
I must say your regex is by far more robust than the other regexes one can find on the Internet at distinguishing between malformed and proper URLs. But I still feel it lacks some basic checks, like:
1. A URL with "https". I know it's a simple one, you just need to append "s?" after the "http" in your regex, but still.
2. The regex treats a URL as proper and not malformed even if it contains an "_".
e.g. https://www._google.com is identified as a proper, not malformed, URL
(the above URL was checked against your regex at http://rubular.com)
To get rid of this I just removed all the '_' from your regex.
Hope you don't mind.
It works fine, so I wonder what those '_' are doing there.
I hope you can shed more light on this in a future comment.
3. The regex treats the URL http://www.google.com/ as malformed, which it shouldn't.
The rest is all fine.
Great work, man. Truly great.
Carry on with such wonderful work,
and please make the necessary changes to your regex for the benefit of all our fellow developers.
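The three points above can be sketched as changes to the pattern. This is only a sketch in Python, and it assumes the regex being discussed is the "finished product" pattern from the "Update" comment in this thread. The ADJUSTED pattern and its name are hypothetical, not the author's. It drops '_' only from the host part, since host name labels traditionally allow just letters, digits and hyphens, and keeps it in path and query parts, where it is legal.

```python
import re

# Assumed to be the regex under discussion (copied from the "Update" comment).
ORIGINAL = re.compile(
    r"^(http\:\/\/[a-zA-Z0-9_\-]+(?:\.[a-zA-Z0-9_\-]+)*\.[a-zA-Z]{2,4}"
    r"(?:\/[a-zA-Z0-9_]+)*"
    r"(?:\/[a-zA-Z0-9_]+\.[a-zA-Z]{2,4}(?:\?[a-zA-Z0-9_]+\=[a-zA-Z0-9_]+)?)?"
    r"(?:\&[a-zA-Z0-9_]+\=[a-zA-Z0-9_]+)*)$"
)

# Hypothetical variant covering the three points raised in the comment.
ADJUSTED = re.compile(
    r"^(https?\:\/\/"                        # 1. accept http or https
    r"[a-zA-Z0-9\-]+(?:\.[a-zA-Z0-9\-]+)*"   # 2. no "_" in host labels
    r"\.[a-zA-Z]{2,4}"
    r"(?:\/[a-zA-Z0-9_]+)*"                  # "_" stays allowed in path segments
    # 3. either a bare trailing slash or the original "file.ext?key=value" part
    r"(?:\/|\/[a-zA-Z0-9_]+\.[a-zA-Z]{2,4}(?:\?[a-zA-Z0-9_]+\=[a-zA-Z0-9_]+)?)?"
    r"(?:\&[a-zA-Z0-9_]+\=[a-zA-Z0-9_]+)*)$"
)

SAMPLES = [
    "https://www.google.com",
    "http://www._google.com",
    "http://www.google.com/",
    "http://www.google.com/my_page/index.html",
]

for url in SAMPLES:
    print(url, bool(ORIGINAL.match(url)), bool(ADJUSTED.match(url)))
```

Removing every '_' from the regex, as described in point 2, would also reject paths such as /my_page/index.html, which is why the sketch touches only the host part.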
Title: URI.pm is insecure
Name: Ted Cambron
Date: 6/9/2008 9:13:16 PM
Comment:
OK, once again for all you purists. The URI.pm family is insecure. Every hacker knows it. A double dot is a valid URI and also a good way to do a traversal hack that can wipe out a database. Using URI.pm will not prevent this. Using my regex will. Why use URI.pm and then use another piece of code to loop through the data and look for double dots when all you need is one simple regex? Thank you sir and good day.
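To make the double-dot point concrete, here is a small Python sketch. It uses Python's urllib.parse purely as a stand-in for a general-purpose URI parser; it is not URI.pm and may behave differently from it. The pattern is the "finished product" regex from the "Update" comment in this thread, and has_dot_dot_segment is a hypothetical helper showing the kind of extra check the comment argues against.

```python
import re
from urllib.parse import urlsplit

# Pattern copied from the "Update" comment in this thread.
PATTERN = re.compile(
    r"^(http\:\/\/[a-zA-Z0-9_\-]+(?:\.[a-zA-Z0-9_\-]+)*\.[a-zA-Z]{2,4}"
    r"(?:\/[a-zA-Z0-9_]+)*"
    r"(?:\/[a-zA-Z0-9_]+\.[a-zA-Z]{2,4}(?:\?[a-zA-Z0-9_]+\=[a-zA-Z0-9_]+)?)?"
    r"(?:\&[a-zA-Z0-9_]+\=[a-zA-Z0-9_]+)*)$"
)


def has_dot_dot_segment(url: str) -> bool:
    """Hypothetical extra check: is any path segment exactly '..'?"""
    return ".." in urlsplit(url).path.split("/")


suspicious = "http://example.com/../etc/passwd"

# A generic parser accepts the URL and keeps the ".." segment untouched.
print(urlsplit(suspicious).path)

# The regex rejects it, because "." is not allowed inside a path segment.
print(PATTERN.match(suspicious))

# With a generic parser, the caller has to look for ".." separately.
print(has_dot_dot_segment(suspicious))
```

The rejection is a side effect of the narrow character classes in the pattern rather than a dedicated traversal check, so the same narrowness also rejects many harmless URLs.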
Title: Not so hot
Name: Meno d'Emme
Date: 4/21/2008 10:41:20 PM
Comment:
The failure cases are valid URIs, and there are many more valid ones that the Rx will not match. The Rx will also match things that are invalid. The URI.pm family is core in Perl and does these things correctly and with more power and flexibility.
This seems to be the case with all the Rxes posted by this author, i.e., there is a core, or well-known, module that will do it right, while the Rxes in question fail on valid cases and pass certain bad ones.
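A few valid URLs that the pattern misses, as a Python sketch. It assumes the Rx in question is the "finished product" pattern from the "Update" comment in this thread, and it uses Python's urllib.parse only as a stand-in for a parsing module; it is not URI.pm. The sample URLs are illustrative choices, not taken from the original failure cases.

```python
import re
from urllib.parse import urlsplit

# Pattern copied from the "Update" comment in this thread.
PATTERN = re.compile(
    r"^(http\:\/\/[a-zA-Z0-9_\-]+(?:\.[a-zA-Z0-9_\-]+)*\.[a-zA-Z]{2,4}"
    r"(?:\/[a-zA-Z0-9_]+)*"
    r"(?:\/[a-zA-Z0-9_]+\.[a-zA-Z]{2,4}(?:\?[a-zA-Z0-9_]+\=[a-zA-Z0-9_]+)?)?"
    r"(?:\&[a-zA-Z0-9_]+\=[a-zA-Z0-9_]+)*)$"
)

# Illustrative valid URLs, each using a feature the pattern has no room for.
VALID_BUT_REJECTED = [
    "http://example.com:8080/index.html",  # explicit port
    "http://example.com/search?q=regex",   # query without a "file.ext" before it
    "http://192.168.0.1/index.html",       # IPv4 address as host
    "http://example.museum/index.html",    # top-level domain longer than 4 letters
    "http://example.com/a%20b.html",       # percent-encoded character in the path
]

for url in VALID_BUT_REJECTED:
    parts = urlsplit(url)
    # The parser splits every one of these; the regex matches none of them.
    print(url, parts.netloc, bool(PATTERN.match(url)))
```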
Title: Malfunction
Name: WolfgangBadura@aon.at
Date: 4/17/2008 9:29:34 AM
Comment:
Using the URL
http://regexlib.com/UserPatterns.aspx?authorId=4f1e9e8d-d9fa-4221-ac16-ee9534263d28
there is no match, either with the .NET engine or with the client-side engine.
Therefore I would offer:
^(http\:\/\/[a-zA-Z0-9_\-]+(?:\.[a-zA-Z0-9_\-]+)*\.[a-zA-Z]{2,4}(?:\/[a-zA-Z0-9_\-]+)*(?:\/[a-zA-Z0-9_]+\.[a-zA-Z]{2,4}(?:\?[a-zA-Z0-9_\-]+=[a-zA-Z0-9_\-]+)?)?(?:\&[a-zA-Z0-9_\-]+\=[a-zA-Z0-9_\-]+)*)$
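A quick Python check of the proposal. It assumes the pattern that failed is the "finished product" regex from the "Update" comment in this thread (named POSTED here) and compares it with the pattern offered above (named PROPOSED here). Both names are just labels for this sketch. The only difference is that the proposal also allows "-" in path segments and in query names and values.

```python
import re

# Assumed to be the pattern that failed (from the "Update" comment).
POSTED = re.compile(
    r"^(http\:\/\/[a-zA-Z0-9_\-]+(?:\.[a-zA-Z0-9_\-]+)*\.[a-zA-Z]{2,4}"
    r"(?:\/[a-zA-Z0-9_]+)*"
    r"(?:\/[a-zA-Z0-9_]+\.[a-zA-Z]{2,4}(?:\?[a-zA-Z0-9_]+\=[a-zA-Z0-9_]+)?)?"
    r"(?:\&[a-zA-Z0-9_]+\=[a-zA-Z0-9_]+)*)$"
)

# The pattern offered in this comment, copied as posted.
PROPOSED = re.compile(
    r"^(http\:\/\/[a-zA-Z0-9_\-]+(?:\.[a-zA-Z0-9_\-]+)*\.[a-zA-Z]{2,4}"
    r"(?:\/[a-zA-Z0-9_\-]+)*"
    r"(?:\/[a-zA-Z0-9_]+\.[a-zA-Z]{2,4}(?:\?[a-zA-Z0-9_\-]+=[a-zA-Z0-9_\-]+)?)?"
    r"(?:\&[a-zA-Z0-9_\-]+\=[a-zA-Z0-9_\-]+)*)$"
)

URLS = [
    # The URL from this comment: the query value contains hyphens.
    "http://regexlib.com/UserPatterns.aspx"
    "?authorId=4f1e9e8d-d9fa-4221-ac16-ee9534263d28",
    # An illustrative URL with a hyphen in a path segment.
    "http://example.com/my-page/index.html",
]

for url in URLS:
    print(url, bool(POSTED.match(url)), bool(PROPOSED.match(url)))
```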
Title: Update
Name: Ted Cambron
Date: 7/28/2007 12:36:21 AM
Comment:
Dang, a double post. :(
After further testing I made one more change and it's working quite well. Here's the finished product...
^(http\:\/\/[a-zA-Z0-9_\-]+(?:\.[a-zA-Z0-9_\-]+)*\.[a-zA-Z]{2,4}(?:\/[a-zA-Z0-9_]+)*(?:\/[a-zA-Z0-9_]+\.[a-zA-Z]{2,4}(?:\?[a-zA-Z0-9_]+\=[a-zA-Z0-9_]+)?)?(?:\&[a-zA-Z0-9_]+\=[a-zA-Z0-9_]+)*)$
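Comparing this pattern with the earlier revision posted in this thread, the change is the quantifier on the first query value: "?" (zero or one character) became "+" (one or more). A small Python sketch of the effect follows. The names EARLIER and FINAL and the sample URLs are made up for illustration.

```python
import re

# Both revisions share everything except one quantifier, so build them
# from the common head and tail.
HEAD = (
    r"^(http\:\/\/[a-zA-Z0-9_\-]+(?:\.[a-zA-Z0-9_\-]+)*\.[a-zA-Z]{2,4}"
    r"(?:\/[a-zA-Z0-9_]+)*"
    r"(?:\/[a-zA-Z0-9_]+\.[a-zA-Z]{2,4}(?:\?[a-zA-Z0-9_]+\=[a-zA-Z0-9_]"
)
TAIL = r")?)?(?:\&[a-zA-Z0-9_]+\=[a-zA-Z0-9_]+)*)$"

EARLIER = re.compile(HEAD + "?" + TAIL)  # query value: zero or one character
FINAL = re.compile(HEAD + "+" + TAIL)    # query value: one or more characters

URLS = [
    "http://example.com/page.html?id=7",   # one-character value
    "http://example.com/page.html?id=42",  # longer value
    "http://example.com/page.html?id=",    # empty value
]

for url in URLS:
    print(url, bool(EARLIER.match(url)), bool(FINAL.match(url)))
```

The earlier revision accepts an empty value but rejects anything longer than one character; the final one does the opposite on both counts.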
Title: Update
Name: Ted Cambron
Date: 7/28/2007 12:34:11 AM
Comment:
Had to make a slight update to the query part.
^(http\:\/\/[a-zA-Z0-9_\-]+(?:\.[a-zA-Z0-9_\-]+)*\.[a-zA-Z]{2,4}(?:\/[a-zA-Z0-9_]+)*(?:\/[a-zA-Z0-9_]+\.[a-zA-Z]{2,4}(?:\?[a-zA-Z0-9_]+\=[a-zA-Z0-9_]?)?)?(?:\&[a-zA-Z0-9_]+\=[a-zA-Z0-9_]+)*)$
Title: Update
Name: Ted Cambron
Date: 7/28/2007 12:25:29 AM
Comment:
Had to make a slight update to the query part.
^(http\:\/\/[a-zA-Z0-9_\-]+(?:\.[a-zA-Z0-9_\-]+)*\.[a-zA-Z]{2,4}(?:\/[a-zA-Z0-9_]+)*(?:\/[a-zA-Z0-9_]+\.[a-zA-Z]{2,4}(?:\?[a-zA-Z0-9_]+\=[a-zA-Z0-9_]?)?)?(?:\&[a-zA-Z0-9_]+\=[a-zA-Z0-9_]+)*)$