205

I want to use PHP to check whether the string stored in the $myoutput variable contains valid link (URL) syntax or is just normal text. The function or solution I'm looking for should recognize all link formats, including ones with GET parameters.

A solution suggested on many sites, actually requesting the URL (using cURL or the file_get_contents() function), is not possible in my case, and I would like to avoid it.

I thought about regular expressions or another solution.

2
  • Using cURL or fetching the HTTP contents may be slow. If you want something speedier and almost as reliable, consider calling gethostbyname() on the hostname. If it resolves to an IP, then it probably has a website. Of course this depends on your needs. Commented Jan 13, 2010 at 18:28
  • 1
    I would be interested in the use case for this. Commented Jun 26, 2021 at 6:47

13 Answers

421

You can use a native Filter Validator

filter_var($url, FILTER_VALIDATE_URL); 

Validates value as URL (according to http://www.faqs.org/rfcs/rfc2396), optionally with required components. Beware a valid URL may not specify the HTTP protocol http:// so further validation may be required to determine the URL uses an expected protocol, e.g. ssh:// or mailto:. Note that the function will only find ASCII URLs to be valid; internationalized domain names (containing non-ASCII characters) will fail.

Example:

if (filter_var($url, FILTER_VALIDATE_URL) === FALSE) {
    die('Not a valid URL');
}
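Since FILTER_VALIDATE_URL accepts any syntactically valid scheme, one way to narrow the check to web links is to inspect the scheme separately. The following is a minimal sketch, not part of the original answer; the helper name is_http_url() is hypothetical, and it assumes that only http and https count as acceptable.

```php
<?php
// Hypothetical helper (not from the answer): accept only URLs that pass
// FILTER_VALIDATE_URL and whose scheme is http or https.
function is_http_url(string $url): bool
{
    // filter_var() returns the URL string on success and false on failure.
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }

    // parse_url() with PHP_URL_SCHEME returns the scheme, or null if absent.
    $scheme = parse_url($url, PHP_URL_SCHEME);

    return is_string($scheme)
        && in_array(strtolower($scheme), ['http', 'https'], true);
}

var_dump(is_http_url('https://example.com/page?id=1&sort=asc')); // bool(true)
var_dump(is_http_url('ssh://example.com'));                      // bool(false)
var_dump(is_http_url('just some text'));                         // bool(false)
```

The scheme whitelist is a policy decision layered on top of the syntax check, so it can be extended (for example with ftp) without touching the validation itself.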

18 Comments

Be aware that FILTER_VALIDATE_URL will not validate the protocol of a url. So ssh://, ftp:// etc will pass.
@SephVelut expected behavior since these are valid URLs.
@Gordon Still important to point out the caveat. http/https is blankly seen as the de-facto aspect of url's.
@JoshHabdas, I think you're missing the point. The PHP code does exactly what it claims to do. But it can't read your mind. There's a huge difference between invalid and unwanted.. Unwanted is very subjective, which is why it's left to the programmer to work out that detail. You might also note the code validates the URL, but doesn't prove it exists. It's not PHP's fault that a user mistyped "amazon," "amozon," which would validate, but is still unwanted.
@Jeffz ttps://www.youtube.com is a syntactically valid URL. Mind the quote in the answer.
37

Here is the best tutorial I found over there:

http://www.w3schools.com/php/filter_validate_url.asp

<?php
$url = "http://www.qbaki.com";

// Remove all illegal characters from a url
$url = filter_var($url, FILTER_SANITIZE_URL);

// Validate url
if (filter_var($url, FILTER_VALIDATE_URL) !== false) {
    echo("$url is a valid URL");
} else {
    echo("$url is not a valid URL");
}
?>

Possible flags:

FILTER_FLAG_SCHEME_REQUIRED - URL must be RFC compliant (like http://example)
FILTER_FLAG_HOST_REQUIRED - URL must include host name (like http://www.example.com)
FILTER_FLAG_PATH_REQUIRED - URL must have a path after the domain name (like www.example.com/example1/)
FILTER_FLAG_QUERY_REQUIRED - URL must have a query string (like "example.php?name=Peter&age=37")
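As a rough illustration of how such flags are passed, the sketch below uses only FILTER_FLAG_PATH_REQUIRED and FILTER_FLAG_QUERY_REQUIRED, which to my knowledge are still available in current PHP versions. The example.com URLs and variable names are placeholders chosen for this sketch.

```php
<?php
// Flags go in the third argument of filter_var().
$noPath   = filter_var('http://www.example.com', FILTER_VALIDATE_URL, FILTER_FLAG_PATH_REQUIRED);
$withPath = filter_var('http://www.example.com/example1/', FILTER_VALIDATE_URL, FILTER_FLAG_PATH_REQUIRED);

$noQuery   = filter_var('http://www.example.com/example.php', FILTER_VALIDATE_URL, FILTER_FLAG_QUERY_REQUIRED);
$withQuery = filter_var('http://www.example.com/example.php?name=Peter&age=37', FILTER_VALIDATE_URL, FILTER_FLAG_QUERY_REQUIRED);

// Flags are bit masks, so they are combined with bitwise OR (|), not logical OR (||).
$both = filter_var(
    'http://www.example.com/example.php?name=Peter&age=37',
    FILTER_VALIDATE_URL,
    FILTER_FLAG_PATH_REQUIRED | FILTER_FLAG_QUERY_REQUIRED
);

var_dump($noPath);    // bool(false): no path after the host
var_dump($withPath);  // the URL string
var_dump($noQuery);   // bool(false): no query string
var_dump($withQuery); // the URL string
var_dump($both);      // the URL string
```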

5 Comments

@ErichGarcía this code doesn't check that it's a valid HTTP/S URL like the OP asks. It will pass things like ssh://, ftp://, etc.; it only checks whether it is a syntactically valid URL according to RFC 2396.
Do not use FILTER_VALIDATE_URL. It is messy and unreliable. E.g. it validates ttps://www.youtube.com as valid
The very necessary filter flags were removed as of PHP 8
@DomenicoDeFelice, filter_var returns "the filtered data", which is a string in this case, or a boolean false if the filter condition is not met. It won't return a boolean true value. So checking filter_var(...) === true just won't work.
Bad solution: removing illegal characters changes the URL. With $url = 'http://pt.wikipedia.org/wiki/Guimarães'; $url = filter_var($url, FILTER_SANITIZE_URL); $url ends up as pt.wikipedia.org/wiki/Guimares, which can be a different route.
26

Using filter_var() will fail for URLs with non-ASCII chars, e.g. http://pt.wikipedia.org/wiki/Guimarães. The following function encodes all non-ASCII chars in the path (e.g. http://pt.wikipedia.org/wiki/Guimar%C3%A3es) before calling filter_var().

Hope this helps someone.

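<?php
function validate_url($url) {
    // parse_url() returns null when there is no path; cast so the string functions below get a string
    $path = (string) parse_url($url, PHP_URL_PATH);
    if ($path !== '') {
        // URL-encode each path segment separately so the slashes are preserved
        $encoded_path = array_map('urlencode', explode('/', $path));
        $url = str_replace($path, implode('/', $encoded_path), $url);
    }
    return filter_var($url, FILTER_VALIDATE_URL) ? true : false;
}

// example
if (!validate_url("http://somedomain.com/some/path/file1.jpg")) {
    echo "NOT A URL";
} else {
    echo "IS A URL";
}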

6 Comments

This is it. Finally someone came back in 2017
Works for me (the others do not BTW) :)
This is the ONLY solution that worked for me. Thanks!
This is not a check which will get 100% correct results! This will only handle non-ascii characters in the path, not in the domain path of the URL. Nowadays, you can also use other unicode chars in the domain - which will be converted to punycode (see en.wikipedia.org/wiki/Punycode), e.g. "guimarães.org". So if you regard the non-punycode converted URLs as valid - your check will fail on these. Even if you handle this in the check, there is still the question of e.g. "ttps://mydomain.org" being falsely interpreted as valid! (as pointed out in other answers)
Not necessary anymore (at least for my PHP 7.4 installation)
11
function is_url($uri) {
    if (preg_match(
        '/^(http|https):\\/\\/[a-z0-9_]+([\\-\\.]{1}[a-z_0-9]+)*\\.[_a-z]{2,5}'
        . '((:[0-9]{1,5})?\\/.*)?$/i',
        $uri
    )) {
        return $uri;
    } else {
        return false;
    }
}

2 Comments

Thanks for the answer, why the concat?
Does not work for me, returns true for "never-relapse-again.com a b c asd as d asd"
7

Actually... filter_var($url, FILTER_VALIDATE_URL); doesn't work very well. When you type in a real URL, it works, but it only checks the syntax, so if you type something like "http://weirtgcyaurbatc", it will still say it's real.

2 Comments

For instance, FILTER_VALIDATE_URL validates ttps://www.youtube.com as valid
Dotless URLs have existed: en.wikipedia.org/wiki/Top-level_domain#Dotless_domains. ttps://www.youtube.com is also a valid URL and would be accessible if a Chrome extension or application registered the ttps:// protocol stackoverflow.com/questions/7087728/…. It would be a silly thing for someone to do, but filter_var isn't going to question how ridiculous a URL is, just if it is a valid URL.
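If dotless hosts such as "http://weirtgcyaurbatc" are unwanted in a particular application, one option is to add a host check on top of filter_var(). This is only a sketch of that policy, not a statement about URL validity: as the comment above notes, dotless hosts are syntactically valid, and this check would also reject http://localhost. The helper name has_dotted_host() is hypothetical.

```php
<?php
// Hypothetical helper: require a syntactically valid URL whose host
// contains at least one dot (e.g. "example.com" but not "localhost").
function has_dotted_host(string $url): bool
{
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }

    // parse_url() with PHP_URL_HOST returns the host, or null if absent.
    $host = parse_url($url, PHP_URL_HOST);

    // Trim leading/trailing dots so "example." does not count as dotted.
    return is_string($host) && strpos(trim($host, '.'), '.') !== false;
}

var_dump(has_dotted_host('http://www.example.com')); // bool(true)
var_dump(has_dotted_host('http://weirtgcyaurbatc')); // bool(false)
```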
6

Personally, I would use a regular expression here. The code below worked perfectly for me.

$baseUrl = url('/'); // for my case https://www.xrepeater.com
$posted_url = "home";
// Test with one by one
/*$posted_url = "/home";
$posted_url = "xrepeater.com";
$posted_url = "www.xrepeater.com";
$posted_url = "http://www.xrepeater.com";
$posted_url = "https://www.xrepeater.com";
$posted_url = "https://xrepeater.com/services";
$posted_url = "xrepeater.dev/home/test";
$posted_url = "home/test";*/

$regularExpression  = "((https?|ftp)\:\/\/)?"; // SCHEME Check
$regularExpression .= "([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?@)?"; // User and Pass Check
$regularExpression .= "([a-z0-9-.]*)\.([a-z]{2,3})"; // Host or IP Check
$regularExpression .= "(\:[0-9]{2,5})?"; // Port Check
$regularExpression .= "(\/([a-z0-9+\$_-]\.?)+)*\/?"; // Path Check
$regularExpression .= "(\?[a-z+&\$_.-][a-z0-9;:@&%=+\/\$_.-]*)?"; // GET Query String Check
$regularExpression .= "(#[a-z_.-][a-z0-9+\$_.-]*)?"; // Anchor Check

if (preg_match("/^$regularExpression$/i", $posted_url)) {
    // "https?" groups the optional "s"; "^http|https://" would match any string starting with "http"
    if (preg_match("@^https?://@i", $posted_url)) {
        $final_url = preg_replace("@(http://)+@i", 'http://', $posted_url);
        // return "*** - ***Match : ".$final_url;
    } else {
        $final_url = 'http://'.$posted_url;
        // return "*** / ***Match : ".$final_url;
    }
} else {
    if (substr($posted_url, 0, 1) === '/') {
        // return "*** / ***Not Match :".$final_url."<br>".$baseUrl.$posted_url;
        $final_url = $baseUrl.$posted_url;
    } else {
        // return "*** - ***Not Match :".$posted_url."<br>".$baseUrl."/".$posted_url;
        // Append the posted value; $final_url is not defined yet in this branch
        $final_url = $baseUrl."/".$posted_url;
    }
}

1 Comment

This is the best answer for validating website URLs. With a few changes this works perfectly. Thanks
4

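You can use this function, but it will return false if the host name cannot be resolved, for example when the website is offline.

function isValidUrl($url) {
    $url = parse_url($url);
    if (!isset($url["host"])) return false;
    // gethostbyname() returns the unmodified host name when the lookup fails
    return !(gethostbyname($url["host"]) == $url["host"]);
}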


4

Given issues with filter_var() needing http://, I use:

$is_url = filter_var($filename, FILTER_VALIDATE_URL) !== false || is_string(parse_url($filename, PHP_URL_SCHEME));

3 Comments

Do not use FILTER_VALIDATE_URL. It is messy and unreliable. E.g. it validates ttps://www.youtube.com as valid
@Jeffz FILTER_VALIDATE_URL does validate urls. A scheme is not limited to http or https only, these are all valid schemes ftp, mailto, file, data and irc. They are registered with IANA but also non registered schemes can be used. So as per URI definition ttps is a valid scheme
@MarinaDunst Yeah but kkdjf://www.youtube.com is valid too according to FILTER_VALIDATE_URL. It's definitely unreliable.
3

Another way to check whether a given URL is valid is to try to access it. The function below fetches the headers from the given URL, which ensures that the URL is valid AND that the web server is alive:

function is_url($url) {
    $response = array();
    // Check if URL is empty
    if (!empty($url)) {
        // get_headers() returns false on failure; fall back to an empty array
        $response = get_headers($url) ?: array();
    }
    return in_array("HTTP/1.1 200 OK", $response, true);
    /* Example of returned headers:
    Array
    (
        [0] => HTTP/1.1 200 OK
        [Date] => Sat, 29 May 2004 12:28:14 GMT
        [Server] => Apache/1.3.27 (Unix) (Red-Hat/Linux)
        [Last-Modified] => Wed, 08 Jan 2003 23:11:55 GMT
        [ETag] => "3f80f-1b6-3e1cb03b"
        [Accept-Ranges] => bytes
        [Content-Length] => 438
        [Connection] => close
        [Content-Type] => text/html
    )*/
}

2 Comments

Nice idea. This will fail if the server is using HTTP/1.0 or HTTP/2.0, or returns a redirect.
Yes, it is a starting point, further improvements can be done easily.
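One possible improvement along the lines of the comments above is to parse the status lines instead of comparing against the literal "HTTP/1.1 200 OK". The sketch below works on an array of header lines shaped like the one get_headers() returns; the helper name final_status_code() and the sample data are hypothetical, and no network request is made here.

```php
<?php
/**
 * Hypothetical helper: return the last HTTP status code found in a list of
 * header lines, or null if there is no status line. When redirects are
 * followed, each response contributes its own status line, so the last
 * one belongs to the final response.
 */
function final_status_code(array $headers): ?int
{
    $code = null;
    foreach ($headers as $line) {
        // Matches "HTTP/1.0 200 OK", "HTTP/1.1 301 Moved Permanently", "HTTP/2 200", ...
        if (is_string($line) && preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})\b#', $line, $m)) {
            $code = (int) $m[1];
        }
    }
    return $code;
}

// Made-up sample: a redirect followed by a successful response.
$sample = [
    'HTTP/1.1 301 Moved Permanently',
    'Location: https://example.com/',
    'HTTP/2 200',
    'Content-Type: text/html',
];

$code = final_status_code($sample);
var_dump($code);                                   // int(200)
var_dump($code !== null && $code >= 200 && $code < 300); // bool(true)
```

In real use, the array would come from get_headers($url), with a false return value treated as "not reachable" before calling the helper.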
1

Came across this article from 2012. It takes into account variables that may or may not be just plain URLs.

The author of the article, David Müeller, provides this function that he says, "...could be worth wile [sic]," along with some examples of filter_var and its shortcomings.

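/**
 * Modified version of `filter_var`.
 *
 * @param mixed $url Could be a URL or possibly much more.
 * @return bool
 */
function validate_url( $url ) {
    $url = trim( $url );

    // The original passed FILTER_FLAG_SCHEME_REQUIRED || FILTER_FLAG_HOST_REQUIRED.
    // Logical OR yields true rather than a bit mask, and both flags are implied by
    // FILTER_VALIDATE_URL and were removed in PHP 8.0, so no flags are passed here.
    return (
        ( strpos( $url, 'http://' ) === 0 || strpos( $url, 'https://' ) === 0 ) &&
        filter_var( $url, FILTER_VALIDATE_URL ) !== false
    );
}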

2 Comments

Works better than simple filter_var, but also validates youtube, which basically is a valid url, but a local one (without tld)
The FILTER_FLAG_ constants are removed in PHP 8.0, so this seems to be no longer an option.
0
// Method excerpt: place inside a class.
public function testing($Url = '') {
    $ch = curl_init($Url);
    curl_setopt($ch, CURLOPT_TIMEOUT, 5);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $data = curl_exec($ch);
    $httpcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    // Treat any 2xx status as success
    return ($httpcode >= 200 && $httpcode < 300) ? true : false;
}

2 Comments

Please add some explanation to your answer such that others can learn from it. Where does $this->output come from?
have made it more clear now
0

Here are three separate functions I wrote for this case; I hope they are useful:

/**
 * Check if the string is a relative or absolute URL
 * @param null|string $url The url string
 * @return bool
 */
function isUrl(string|null $url):bool{
    return (!empty($url)) && preg_match("/^(\w+\:[\/]*)?(\/?[^\/\{\}\|^\[\]\"`\r\n\t\f]){1,}$/",$url);
}

/**
 * Check if the string is only a relative URL
 * @param null|string $url The url string
 * @return bool
 */
function isRelativeUrl(string|null $url):bool{
    return (!empty($url)) && preg_match("/^(\/?[^\/\{\}\|\^\[\]\"\`\r\n\t\f]){1,}$/",$url);
}

/**
 * Check if the string is only an absolute URL
 * @param null|string $url The url string
 * @return bool
 */
function isAbsoluteUrl(string|null $url):bool{
    return (!empty($url)) && preg_match("/^\w+\:\/*(\/?[^\/\{\}\|^\[\]\"\`\r\n\t\f]){1,}$/",$url);
}

Enjoy...


-2

If anyone is interested in using cURL for validation, you can use the following code.

<?php
// Method excerpt: place inside a class.
public function validationUrl($Url) {
    if ($Url == NULL) {
        return false;
    }
    $ch = curl_init($Url);
    curl_setopt($ch, CURLOPT_TIMEOUT, 5);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $data = curl_exec($ch);
    $httpcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    // Treat any 2xx status as success
    return ($httpcode >= 200 && $httpcode < 300) ? true : false;
}
