I'm trying to determine whether it is a bug that Python's urllib.urlopen() function omits an HTTP Accept header when making simple REST API requests.
The Facebook Graph API seems to notice whether the header is present or not:
    GET /zuck HTTP/1.0
    Host: graph.facebook.com
    Accept: */*

Without the Accept header, the returned content-type of application/json; charset=UTF-8 becomes text/javascript; charset=UTF-8. That may be a bug in Facebook's REST API or it may be a legitimate response to a missing Accept header.
I notice that command-line tools like curl send Accept: */* by default:
    $ curl -v https://graph.facebook.com/zuck
    > GET /zuck HTTP/1.1
    > User-Agent: curl/7.30.0
    > Host: graph.facebook.com
    > Accept: */*

Likewise, the Python requests package also uses Accept: */* as a default:
    def default_headers():
        return CaseInsensitiveDict({
            'User-Agent': default_user_agent(),
            'Accept-Encoding': ', '.join(('gzip', 'deflate')),
            'Accept': '*/*',
            'Connection': 'keep-alive',
        })

I presume that curl and requests add the default for a reason, but I'm not sure what that reason is.
RFC 2616 for HTTP/1.1 says that */* indicates all media types and that if no Accept header field is present, then it is assumed that the client accepts all media types. This would seem to indicate that Accept: */* is optional and its omission would have no effect. That said, Python is using HTTP/1.0 and the RFCs are silent about the effect of omitting the header.
I would like to determine whether the best practice is to include Accept: */* as curl and requests do, or whether it is okay to omit it as Python's urllib.urlopen() does.
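In the meantime, a caller who wants to match curl and requests can set the header explicitly. The following is a minimal sketch, assuming Python 3's urllib.request (the successor to the Python 2 urllib and urllib2 modules used in this question). The helper name build_request is hypothetical, and nothing is sent over the network here; the code only constructs the request object.

```python
import urllib.request


def build_request(url, accept="*/*"):
    """Return a Request that carries an explicit Accept header.

    build_request is a hypothetical helper for illustration; it is not
    part of the standard library.
    """
    req = urllib.request.Request(url)
    # Request objects carry no Accept header unless one is added.
    req.add_header("Accept", accept)
    return req


req = build_request("https://graph.facebook.com/zuck")
print(req.get_header("Accept"))

# Passing the request to urllib.request.urlopen(req) would then send
# the Accept header along with the other default headers.
```

Whether this should be the library's default behavior rather than something each caller adds is exactly what I am trying to find out.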
The question is important because I'm in a position to fix urllib.urlopen() if it is determined to be buggy or if it is problematic for use with REST APIs as commonly implemented using HTTP/1.0:
    >>> import httplib
    >>> httplib.HTTPConnection.debuglevel = 1
    >>> import urllib
    >>> u = urllib.urlopen('https://graph.facebook.com/zuck')
    send: 'GET /zuck HTTP/1.0\r\nHost: graph.facebook.com\r\nUser-Agent: Python-urllib/1.17\r\n\r\n'

The related questions on StackOverflow aren't helpful for this question. "What does 'Accept: */*' mean under Client section of Request Headers?" asks what */* means (we already know that it means all media types), and "Send a curl request with no Accept header?" asks how to omit the Accept header in a curl request. My question focuses on whether you should include */* and whether it is a bug to omit it.