Python urllib2, basic HTTP authentication, and tr.im

Question

I'm playing around, trying to write some code to use the tr.im APIs to shorten a URL.

After reading http://docs.python.org/library/urllib2.html, I tried:

    TRIM_API_URL = 'http://api.tr.im/api'
    auth_handler = urllib2.HTTPBasicAuthHandler()
    auth_handler.add_password(realm='tr.im',
                              uri=TRIM_API_URL,
                              user=USERNAME,
                              passwd=PASSWORD)
    opener = urllib2.build_opener(auth_handler)
    urllib2.install_opener(opener)
    response = urllib2.urlopen('%s/trim_simple?url=%s' % (TRIM_API_URL, url_to_trim))
    url = response.read().strip()

response.code is 200 (I think it should be 202). url is valid, but the basic HTTP authentication doesn't seem to have worked, because the shortened URL isn't in my list of URLs (at http://tr.im/?page=1).

After reading http://www.voidspace.org.uk/python/articles/authentication.shtml#doing-it-properly I also tried:

    TRIM_API_URL = 'api.tr.im/api'
    password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
    password_mgr.add_password(None, TRIM_API_URL, USERNAME, PASSWORD)
    auth_handler = urllib2.HTTPBasicAuthHandler(password_mgr)
    opener = urllib2.build_opener(auth_handler)
    urllib2.install_opener(opener)
    response = urllib2.urlopen('http://%s/trim_simple?url=%s' % (TRIM_API_URL, url_to_trim))
    url = response.read().strip()

But I get the same results. (response.code is 200 and url is valid, but not recorded in my account at http://tr.im/.)

If I use query string parameters instead of basic HTTP authentication, like this:

    TRIM_API_URL = 'http://api.tr.im/api'
    response = urllib2.urlopen('%s/trim_simple?url=%s&username=%s&password=%s'
                               % (TRIM_API_URL, url_to_trim, USERNAME, PASSWORD))
    url = response.read().strip()

...then not only is url valid but it's recorded in my tr.im account. (Though response.code is still 200.)

There must be something wrong with my code though (and not tr.im's API), because

$ curl -u yacitus:xxxx http://api.tr.im/api/trim_url.json?url=http://www.google.co.uk

...returns:

{"trimpath":"hfhb","reference":"nH45bftZDWOX0QpVojeDbOvPDnaRaJ","trimmed":"11\/03\/2009","destination":"http:\/\/www.google.co.uk\/","trim_path":"hfhb","domain":"google.co.uk","url":"http:\/\/tr.im\/hfhb","visits":0,"status":{"result":"OK","code":"200","message":"tr.im URL Added."},"date_time":"2009-03-11T10:15:35-04:00"}

...and the URL does appear in my list of URLs on http://tr.im/?page=1.

And if I run:

$ curl -u yacitus:xxxx http://api.tr.im/api/trim_url.json?url=http://www.google.co.uk

...again, I get:

{"trimpath":"hfhb","reference":"nH45bftZDWOX0QpVojeDbOvPDnaRaJ","trimmed":"11\/03\/2009","destination":"http:\/\/www.google.co.uk\/","trim_path":"hfhb","domain":"google.co.uk","url":"http:\/\/tr.im\/hfhb","visits":0,"status":{"result":"OK","code":"201","message":"tr.im URL Already Created [yacitus]."},"date_time":"2009-03-11T10:15:35-04:00"}

Note code is 201, and message is "tr.im URL Already Created [yacitus]."

I must not be doing the basic HTTP authentication correctly (in either attempt). Can you spot my problem? Perhaps I should look and see what's being sent over the wire? I've never done that before. Are there Python APIs I can use (perhaps in pdb)? Or is there another tool (preferably for Mac OS X) I can use?

The site must return a "WWW-Authenticate" header and status code 401 before urllib2 (or httplib2) will send your credentials. See my answer below.
– Mark Mikofski, Commented Mar 15, 2012 at 22:29

Ben Keating · Accepted Answer · 2010-11-15 20:52:54Z

247

This seems to work really well (taken from another thread)

    import urllib2, base64

    request = urllib2.Request("http://api.foursquare.com/v1/user")
    base64string = base64.encodestring('%s:%s' % (username, password)).replace('\n', '')
    request.add_header("Authorization", "Basic %s" % base64string)
    result = urllib2.urlopen(request)
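For readers on Python 3, where urllib2 became urllib.request and base64.encodestring is no longer available, the same idea might look roughly like the sketch below. The helper name basic_auth_request and the example credentials are made up for illustration, and nothing is sent over the network here; the request object is only built and inspected.

```python
import base64
from urllib.request import Request

def basic_auth_request(url, username, password):
    """Return a Request that carries a Basic Authorization header up front."""
    # b64encode works on bytes and does not insert newlines,
    # so no replace('\n', '') is needed.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    request = Request(url)
    request.add_header("Authorization", "Basic " + token.decode("ascii"))
    return request

# Hypothetical credentials; the URL is the tr.im endpoint from the question.
request = basic_auth_request("http://api.tr.im/api/trim_simple", "user", "passwd")
print(request.get_header("Authorization"))
```

Passing the returned object to urllib.request.urlopen would send the header on the first request, without waiting for a 401 challenge.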


4 Comments

Paweł Polewicz Over a year ago

Instead of base64.encodestring and replace, use base64.standard_b64encode

jfs Over a year ago

request.add_header('Authorization', b'Basic ' + base64.b64encode(username + b':' + password))

mcepl Over a year ago

Based on this answer I have created a package, urllib2_prior_auth, which has no dependencies outside of the stdlib, and I am trying to push the relevant change into the stdlib.

makapuf Over a year ago

Or even shorter, avoiding an import (Python 2 only; the base64 codec appends a trailing newline, hence the strip()): request.add_header('Authorization', b'Basic ' + (username + b':' + password).encode('base64').strip())

Mark Mikofski · Accepted Answer · 2014-09-05 09:04:54Z

20

Really cheap solution:

    urllib.urlopen('http://user:xxxx@api.tr.im/api')

(which you may decide is not suitable for a number of reasons, like security of the url)
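Note that this trick relies on the legacy urllib.urlopen of Python 2. urllib2.urlopen (and, as far as I know, urllib.request.urlopen on Python 3) does not strip credentials out of the URL, which is where the "nonnumeric port" error mentioned in the comments comes from. A minimal sketch of a workaround, assuming Python 3: split the user info out with urllib.parse and move it into a header. The helper name request_from_url and the credentials are hypothetical, and the sketch ignores IPv6 literals and percent-encoded credentials.

```python
import base64
from urllib.parse import urlsplit, urlunsplit
from urllib.request import Request

def request_from_url(url_with_credentials):
    """Move user:password from the URL into a Basic Authorization header."""
    parts = urlsplit(url_with_credentials)
    # Rebuild the network location without the "user:password@" prefix.
    netloc = parts.hostname
    if parts.port is not None:
        netloc = f"{netloc}:{parts.port}"
    clean_url = urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment))
    request = Request(clean_url)
    if parts.username is not None:
        credentials = f"{parts.username}:{parts.password or ''}".encode("utf-8")
        token = base64.b64encode(credentials).decode("ascii")
        request.add_header("Authorization", "Basic " + token)
    return request

# Hypothetical credentials; no request is actually sent.
request = request_from_url("http://user:passwd@api.tr.im/api")
print(request.full_url)
print(request.get_header("Authorization"))
```

The credentials still end up wherever the original URL string is stored or logged, so the security caveat above applies just the same.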

GitHub API example:

    >>> import urllib, json
    >>> result = urllib.urlopen('https://personal-access-token:x-oauth-basic@api.github.com/repos/:owner/:repo')
    >>> r = json.load(result.fp)
    >>> result.close()

answered Mar 11, 2009 at 15:50 by Ali Afshar; edited Sep 5, 2014 at 9:04 by Mark Mikofski

5 Comments

Daryl Spitzer Over a year ago

Are there any advantages to this over using query string parameters?

Ali Afshar Over a year ago

Daryl: if it works, I would say that it's an advantage yes, and probably more secure than query string arguments as most http clients are a bit more careful about how they handle them.

Daryl Spitzer Over a year ago

I'll probably go with this (so you get my upvote), but I'd still like to figure out what's wrong with my code (so this won't be my accepted answer).

Nick Bolton Over a year ago

This returns an error... InvalidURL: nonnumeric port: 'xxxx@api.tr.im/api'

CantGetANick Over a year ago

@nbolton make sure you are not using urllib2.urlopen(url)

Community · Accepted Answer · 2017-05-23 12:26:00Z

Take a look at this SO post answer and also look at this basic authentication tutorial from the urllib2 missing manual.

In order for urllib2 basic authentication to work, the HTTP response must carry status code 401 Unauthorized and a "WWW-Authenticate" header whose value starts with "Basic". Otherwise Python won't send your login info, and you will need to use Requests, use urllib.urlopen(url) with your login in the URL, or add the header yourself as in @Flowpoke's answer.

You can view your error by putting your urlopen in a try block:

    try:
        urllib2.urlopen(urllib2.Request(url))
    except urllib2.HTTPError, e:
        print e.headers
        print e.headers.has_key('WWW-Authenticate')
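The challenge-response behaviour described above can be observed against a throwaway local server. The following is only a sketch for Python 3 (where urllib2 lives on as urllib.request); the ChallengeHandler class, the "demo" realm, and the credentials are invented for the demonstration. The server records the Authorization header of every request it receives.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import (HTTPBasicAuthHandler,
                            HTTPPasswordMgrWithDefaultRealm,
                            ProxyHandler, build_opener)

seen_auth_headers = []  # Authorization header of each request, in arrival order

class ChallengeHandler(BaseHTTPRequestHandler):
    """Toy server: answers 401 + WWW-Authenticate until credentials arrive."""
    def do_GET(self):
        auth = self.headers.get("Authorization")
        seen_auth_headers.append(auth)
        if auth is None:
            self.send_response(401)
            self.send_header("WWW-Authenticate", 'Basic realm="demo"')
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"ok"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass

server = HTTPServer(("127.0.0.1", 0), ChallengeHandler)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/" % server.server_port

password_mgr = HTTPPasswordMgrWithDefaultRealm()
password_mgr.add_password(None, url, "user", "passwd")
# ProxyHandler({}) disables any proxy configured in the environment.
opener = build_opener(ProxyHandler({}), HTTPBasicAuthHandler(password_mgr))

with opener.open(url) as response:
    body = response.read()

server.shutdown()
server.server_close()
print(seen_auth_headers)
```

The server should see two requests: the first without credentials, and a retry with them after the 401. A server that answers the first, unauthenticated request with 200 (as tr.im appears to) never triggers the retry.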

This helped me because printing the headers led me to realise I had typo'd the authentication realm. +1

jfs · Accepted Answer · 2015-11-03 21:39:08Z

The recommended way is to use the requests module:

    #!/usr/bin/env python
    import requests # $ python -m pip install requests
    ####from pip._vendor import requests # bundled with python

    url = 'https://httpbin.org/hidden-basic-auth/user/passwd'
    user, password = 'user', 'passwd'

    r = requests.get(url, auth=(user, password)) # send auth unconditionally
    r.raise_for_status() # raise an exception if the authentication fails

Here's a single source Python 2/3 compatible urllib2-based variant:

    #!/usr/bin/env python
    import base64
    try:
        from urllib.request import Request, urlopen
    except ImportError: # Python 2
        from urllib2 import Request, urlopen

    credentials = '{user}:{password}'.format(**vars()).encode()
    urlopen(Request(url, headers={'Authorization': # send auth unconditionally
        b'Basic ' + base64.b64encode(credentials)})).close()

Python 3.5 introduced HTTPPasswordMgrWithPriorAuth(), which allows:

..to eliminate unnecessary 401 response handling, or to unconditionally send credentials on the first request in order to communicate with servers that return a 404 response instead of a 401 if the Authorization header is not sent..

    #!/usr/bin/env python3
    import urllib.request as urllib2

    password_manager = urllib2.HTTPPasswordMgrWithPriorAuth()
    password_manager.add_password(None, url, user, password,
                                  is_authenticated=True) # to handle 404 variant
    auth_manager = urllib2.HTTPBasicAuthHandler(password_manager)
    opener = urllib2.build_opener(auth_manager)

    opener.open(url).close()

It is easy to replace HTTPBasicAuthHandler() with ProxyBasicAuthHandler() if necessary in this case.
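To check, without touching the network, that the prior-auth password manager really makes the handler attach credentials before any challenge, one can call the handler's request pre-processing hook directly. This is a sketch under the assumption of Python 3.5 or later; calling http_request by hand is only for inspection (an opener normally does it for you), and the credentials are the httpbin ones used above.

```python
from urllib.request import (HTTPBasicAuthHandler,
                            HTTPPasswordMgrWithPriorAuth, Request)

url = "https://httpbin.org/hidden-basic-auth/user/passwd"
password_manager = HTTPPasswordMgrWithPriorAuth()
password_manager.add_password(None, url, "user", "passwd", is_authenticated=True)
handler = HTTPBasicAuthHandler(password_manager)

# http_request is the hook an opener runs on each outgoing request;
# invoking it manually here sends nothing.
request = handler.http_request(Request(url))
print(password_manager.is_authenticated(url))
print(request.get_header("Authorization"))
```

With is_authenticated left at its default of False, the hook returns the request unchanged and the handler falls back to waiting for a 401.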

HTTPPasswordMgrWithPriorAuth and is_authenticated=True is the key!!!

mcepl · Accepted Answer · 2014-09-10 21:38:56Z

4

I would suggest that the current solution is to use my package urllib2_prior_auth, which solves this pretty nicely (I am working on getting it included in the standard library).


1 Comment

mcepl Over a year ago

It has been included in Python 3.5 as urllib.request.HTTPPasswordMgrWithPriorAuth

Community · Accepted Answer · 2017-05-23 12:10:04Z

Same solutions as Python urllib2 Basic Auth Problem apply.

See https://stackoverflow.com/a/24048852/1733117; you can subclass urllib2.HTTPBasicAuthHandler to add the Authorization header to each request that matches a known URL.

    import base64
    import urllib2

    class PreemptiveBasicAuthHandler(urllib2.HTTPBasicAuthHandler):
        '''Preemptive basic auth.

        Instead of waiting for a 401 to then retry with the credentials,
        send the credentials if the url is handled by the password manager.
        Note: please use realm=None when calling add_password.'''
        def http_request(self, req):
            url = req.get_full_url()
            realm = None
            # this is very similar to the code from retry_http_basic_auth()
            # but returns a request object.
            user, pw = self.passwd.find_user_password(realm, url)
            if pw:
                raw = "%s:%s" % (user, pw)
                auth = 'Basic %s' % base64.b64encode(raw).strip()
                req.add_unredirected_header(self.auth_header, auth)
            return req

        https_request = http_request
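A rough Python 3 port of the same handler, together with how it might be wired up, is sketched below. It assumes urllib.request in place of urllib2; the credentials and the example.com URL are placeholders, and the handler is exercised by calling http_request directly so that nothing goes over the network.

```python
import base64
import urllib.request

class PreemptiveBasicAuthHandler(urllib.request.HTTPBasicAuthHandler):
    """Python 3 port: attach credentials before any 401 is received."""
    def http_request(self, req):
        # realm=None matches entries registered with add_password(None, ...).
        user, pw = self.passwd.find_user_password(None, req.full_url)
        if pw:
            raw = ("%s:%s" % (user, pw)).encode("utf-8")
            auth = "Basic " + base64.b64encode(raw).decode("ascii")
            req.add_unredirected_header(self.auth_header, auth)
        return req

    https_request = http_request

password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
password_mgr.add_password(None, "http://api.tr.im/api", "user", "passwd")
handler = PreemptiveBasicAuthHandler(password_mgr)
# opener.open(...) would now send the header on the first request.
opener = urllib.request.build_opener(handler)

covered = handler.http_request(
    urllib.request.Request("http://api.tr.im/api/trim_simple"))
other = handler.http_request(urllib.request.Request("http://example.com/"))
print(covered.get_header("Authorization"))
print(other.get_header("Authorization"))
```

Only URLs under a prefix registered with the password manager get the header; requests to other hosts are left untouched, so credentials are not leaked to unrelated servers.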

Andrew G · Accepted Answer · 2012-11-11 02:44:58Z

0

Try python-requests or python-grab.

