Is it "safe" to use $_SERVER['HTTP_HOST'] for all links on a site without having to worry about XSS attacks, even when used in forms?
Yes, it's safe to use $_SERVER['HTTP_HOST'], (and even $_GET and $_POST) as long as you verify them before accepting them. This is what I do for secure production servers:
/* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */ $reject_request = true; if(array_key_exists('HTTP_HOST', $_SERVER)){ $host_name = $_SERVER['HTTP_HOST']; // [ need to cater for `host:port` since some "buggy" SAPI(s) have been known to return the port too, see https://stackoverflow.com/questions/1459739/php-serverhttp-host-vs-serverserver-name-am-i-understanding-the-ma/12046836#12046836 $strpos = strpos($host_name, ':'); if($strpos !== false){ $host_name = substr($host_name, $strpos); } // ] // [ for dynamic verification, replace this chunk with db/file/curl queries $reject_request = !array_key_exists($host_name, array( 'a.com' => null, 'a.a.com' => null, 'b.com' => null, 'b.b.com' => null )); // ] } if($reject_request){ // log errors // display errors (optional) exit; } /* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */ echo 'Hello World!'; // ...
The advantage of $_SERVER['HTTP_HOST'] is that its behavior is more well-defined than $_SERVER['SERVER_NAME']. Contrast ➫➫:
Contents of the Host: header from the current request, if there is one.
with:
The name of the server host under which the current script is executing.
Using a better defined interface like $_SERVER['HTTP_HOST'] means that more SAPIs will implement it using reliable well-defined behavior. (Unlike the other.) However, it is still totally SAPI dependent ➫➫:
There is no guarantee that every web server will provide any of these [$_SERVER entries]; servers may omit some, or provide others not listed here.
To understand how to properly retrieve the host name, first and foremost you need to understand that a server which contains only code has no means of knowing (pre-requisite for verifying) its own name on the network. It needs to interface with a component that supplies it its own name. This can be done via:
Usually its done via the local (SAPI) config file. Note that you have configured it correctly, e.g. in Apache ➫➫:
A couple of things need to be 'faked' to make the dynamic virtual host look like a normal one.
The most important is the server name which is used by Apache to generate self-referential URLs, etc. It is configured with the ServerName directive, and it is available to CGIs via the SERVER_NAME environment variable.
The actual value used at run time is controlled by the UseCanonicalName setting.
With UseCanonicalName Off the server name comes from the contents of the Host: header in the request. With UseCanonicalName DNS it comes from a reverse DNS lookup of the virtual host's IP address. The former setting is used for name-based dynamic virtual hosting, and the latter is used for** IP-based hosting.
If Apache cannot work out the server name because there is no Host: header or the DNS lookup fails then the value configured with ServerName is used instead.