In search of the perfect URL validation regex

submited by
Style Pass
2021-09-25 17:30:04

To clarify, I’m looking for a decent regular expression to validate URLs that were entered as user input with. I have no interest in parsing a list of URLs from a given string of text (even though some of the regexes on this page are capable of doing that). I also don’t want to allow every possible technically valid URL — quite the opposite. See the URL Standard if you’re looking to parse URLs in the same way that browsers do.

Assume that this regex will be used for a public URL shortener written in PHP, so URLs like http://localhost/, //foo.bar/, ://foo.bar/, data:text/plain;charset=utf-8,OHAI and tel:+1234567890 shouldn’t pass (even though they’re technically valid). Also, in this case I only want to allow the HTTP, HTTPS and FTP protocols.

Also, single weird leading and/or trailing characters aren’t tested for. Just imagine you’re doing this before testing $url with the regex: $url = trim($url, '!"#$%&\'()*+,-./@:;<=>[\\]^_`{|}~');

Leave a Comment