I am trying to open a page using urllib2, but I keep getting connection timed out errors.
The line I am using is: f = urllib2.urlopen(url)

The exact error is:

URLError: <urlopen error [Errno 110] Connection timed out>

  • Can you load the URL in a browser? Commented Jul 7, 2010 at 17:31
  • Yes, the URL loads in the browser. I think the problem may be with my connection settings. How does Python's urllib2 connect to the internet? Commented Jul 7, 2010 at 17:39
  • I was behind a proxy and now I am not. The only change I made was to remove http_proxy and https_proxy. Commented Jul 7, 2010 at 17:47
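
Since the comments point at proxy settings, here is a minimal sketch of how to check what urllib2 will pick up from the environment, and how to build an opener that skips proxies entirely. The helper names proxy_env_vars and build_direct_opener are made up for illustration, and the import fallback is only there so the snippet also runs on Python 3.

```python
import os

try:  # Python 2, as in the question
    import urllib2
    from urllib import getproxies
except ImportError:  # Python 3 fallback (assumption: same API under urllib.request)
    import urllib.request as urllib2
    from urllib.request import getproxies


def proxy_env_vars(environ=None):
    """Return the *_proxy environment variables that are currently set."""
    environ = os.environ if environ is None else environ
    return dict((k, v) for k, v in environ.items()
                if k.lower().endswith('_proxy'))


def build_direct_opener():
    """Build an opener that ignores any detected proxy settings."""
    # An empty dict tells ProxyHandler not to use any proxy.
    return urllib2.build_opener(urllib2.ProxyHandler({}))


if __name__ == '__main__':
    print(proxy_env_vars())   # variables such as http_proxy, https_proxy
    print(getproxies())       # what urllib2 would use by default
    opener = build_direct_opener()
    # opener.open(url, timeout=10)  # 'url' is whatever page you are fetching
```

If a stale http_proxy or https_proxy value shows up here (for example, in a shell that was started before the variables were removed), urllib2 may still be trying to reach the old proxy, which would be one plausible cause of the timeout.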

2 Answers

urllib2 respects robots.txt. Many sites block the default User-Agent.

Try setting a different User-Agent by creating a Request object and passing it to urlopen:

    import urllib2

    request = urllib2.Request('http://www.example.com/')
    request.add_header('User-agent', 'Mozilla/5.0 (Linux i686)')
    response = urllib2.urlopen(request)

Several detailed walk-throughs are available, such as http://www.doughellmann.com/PyMOTW/urllib2/
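
As a rough sketch of the same idea with an explicit timeout and error handling added (neither is part of the snippet above), something like the following could be used. The function names build_request and fetch and the 10-second default are illustrative assumptions, and the import fallback only exists so the code also runs on Python 3.

```python
import socket

try:  # Python 2, as in the question
    import urllib2
    from urllib2 import URLError
except ImportError:  # Python 3 fallback
    import urllib.request as urllib2
    from urllib.error import URLError


def build_request(url, user_agent='Mozilla/5.0 (Linux i686)'):
    """Create a Request that carries a custom User-Agent header."""
    request = urllib2.Request(url)
    request.add_header('User-agent', user_agent)
    return request


def fetch(url, timeout=10):
    """Return the response body, or None if the request fails."""
    try:
        response = urllib2.urlopen(build_request(url), timeout=timeout)
        return response.read()
    except (URLError, socket.timeout) as exc:
        # A connection timeout normally surfaces as URLError;
        # a timeout while reading the body can surface as socket.timeout.
        print('Request failed: %s' % exc)
        return None
```

Calling fetch('http://www.example.com/') then either returns the page body or prints the reason for the failure instead of raising.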


1 Comment

It seems unlikely that it respects robots.txt. That would require urllib2 to make an additional network request to fetch the file. Sites may well block certain user agents, but that is a different thing.

As a general strategy, open Wireshark and watch the traffic generated by urllib2.urlopen(url). You may be able to see where the error is coming from.
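
Before reaching for a packet capture, a quicker check is to see whether a plain TCP connection to the host succeeds at all, since Errno 110 is raised at the connection stage. This is only a sketch: the function name can_connect and the 5-second default are assumptions, and it bypasses any proxy, so it tests the direct route only.

```python
import socket

try:  # Python 2
    from urlparse import urlparse
except ImportError:  # Python 3
    from urllib.parse import urlparse


def can_connect(url, timeout=5):
    """Return True if a direct TCP connection to the URL's host succeeds."""
    parts = urlparse(url)
    # Fall back to the default port for the scheme when none is given.
    port = parts.port or (443 if parts.scheme == 'https' else 80)
    try:
        conn = socket.create_connection((parts.hostname, port), timeout)
    except (socket.error, socket.timeout) as exc:
        print('TCP connect to %s:%s failed: %s' % (parts.hostname, port, exc))
        return False
    conn.close()
    return True
```

If can_connect(url) returns False while the browser loads the page, the browser is probably taking a different route (a proxy, for instance) than the Python process.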
