9

The following code sends a request every 200ms and should handle the responses asynchronously whenever they come.

Over HTTP it works as expected - a request is sent every 200ms and the response callback is called independently whenever a response arrives. Over HTTPS however, the requests get significantly delayed whenever a response arrives (even though my response handler does no work). The response callback seems to be called twice for each request, once with a zero-length response (edit: this is because of a redirect and seems unrelated to the blocking issue, thanks Padraic).

What could be causing this blocking behaviour over HTTPS? (www.bbc.co.uk is just an example that's geographically far from me, but it happens with all servers I've tested).

grequests_test.py

import time import sys import grequests import gevent def cb(res, **kwargs): print("**** Response", time.time(), len(res.text)) for i in range(10): unsent = grequests.get(sys.argv[1], hooks={'response': cb}) print("Request", time.time()) grequests.send(unsent, grequests.Pool(1)) gevent.sleep(0.2) gevent.sleep(5) 

$ ipython2 grequests_test.py 'http://www.bbc.co.uk' (expected result)

('Request', 1459050191.499266) ('Request', 1459050191.701466) ('Request', 1459050191.903223) ('Request', 1459050192.10403) ('Request', 1459050192.305626) ('**** Response', 1459050192.099185, 179643) ('Request', 1459050192.506476) ('**** Response', 1459050192.307869, 179643) ('Request', 1459050192.707745) ('**** Response', 1459050192.484711, 179643) ('Request', 1459050192.909376) ('**** Response', 1459050192.696583, 179643) ('Request', 1459050193.110528) ('**** Response', 1459050192.870476, 179643) ('Request', 1459050193.311601) ('**** Response', 1459050193.071679, 179639) ('**** Response', 1459050193.313615, 179680) ('**** Response', 1459050193.4959, 179643) ('**** Response', 1459050193.687054, 179680) ('**** Response', 1459050193.902827, 179639) 

ipython2 grequests_test.py 'https://www.bbc.co.uk' (requests are sent late)

('Request', 1459050203.24336) ('Request', 1459050203.44473) ('**** Response', 1459050204.423302, 0) ('Request', 1459050204.424748) <------------- THIS REQUEST TIME IS LATE ('**** Response', 1459050205.294426, 0) ('Request', 1459050205.296722) ('Request', 1459050205.497924) ('**** Response', 1459050206.456572, 0) ('Request', 1459050206.457875) ('**** Response', 1459050207.363188, 0) ('**** Response', 1459050208.247189, 0) ('Request', 1459050208.249579) ('**** Response', 1459050208.250645, 179643) ('**** Response', 1459050208.253638, 179643) ('Request', 1459050208.451083) ('**** Response', 1459050209.426556, 0) ('Request', 1459050209.428032) ('**** Response', 1459050209.428929, 179643) ('**** Response', 1459050210.331425, 0) ('**** Response', 1459050211.247793, 0) ('Request', 1459050211.251574) ('**** Response', 1459050211.252321, 179643) ('**** Response', 1459050211.25519, 179680) ('**** Response', 1459050212.397186, 0) ('**** Response', 1459050213.299109, 0) ('**** Response', 1459050213.588854, 179588) ('**** Response', 1459050213.590434, 179643) ('**** Response', 1459050213.593731, 179643) ('**** Response', 1459050213.90507, 179643) ('**** Response', 1459050213.909386, 179643) 

Notice that the first response appears to arrive long after the next request should have been sent but wasn't. Why didn't the sleep return, and the next request get sent, before that first response arrived?

0

2 Answers 2

1

The extra responses and the 0 length responses are easily explained, if you add a print(res.status_code) you will see a lot of 301's as in the case of https://www.bbc.co.uk you get redirected to http://www.bbc.co.uk so that is why you see extras responses and 0 returned for the len(res.text), you can see the output below:

In [11]: def cb(res, **kwargs): ....: print(res.status_code) ....: print("**** Response", time.time(), len(res.text)) ....: In [12]: for i in range(10): ....: unsent = grequests.get("https://www.bbc.co.uk", hooks={'response': cb}) ....: print("Request", time.time()) ....: grequests.send(unsent, grequests.Pool(1)) ....: gevent.sleep(0.2) ....: gevent.sleep(5) ....: ('Request', 1459368704.32843) 301 ('**** Response', 1459368704.616453, 0) ('Request', 1459368704.618786) 301 ('**** Response', 1459368704.937159, 0) ('Request', 1459368704.941069) 200 ('**** Response', 1459368704.943034, 141486) 301 ('**** Response', 1459368705.496423, 0) ('Request', 1459368705.498991) 200 ('**** Response', 1459368705.50162, 141448) 301 ('**** Response', 1459368705.784145, 0) ('Request', 1459368705.785769) 200 ('**** Response', 1459368705.786772, 141486) 301 ('**** Response', 1459368706.110865, 0) ('Request', 1459368706.114921) 200 ('**** Response', 1459368706.116124, 141448) 301 ('**** Response', 1459368706.396807, 0) ('Request', 1459368706.400795) 200 301 ('**** Response', 1459368706.756861, 0) ('Request', 1459368706.76069) 200 ('**** Response', 1459368706.763268, 141448) ('**** Response', 1459368706.488708, 141448) 301 ('**** Response', 1459368707.065011, 0) ('Request', 1459368707.069128) 200 ('**** Response', 1459368707.071981, 141448) 301 ('**** Response', 1459368707.366737, 0) ('Request', 1459368707.370713) 200 ('**** Response', 1459368707.373597, 141448) 301 ('**** Response', 1459368707.73689, 0) 200 ('**** Response', 1459368707.743815, 141448) 200 ('**** Response', 1459368707.902499, 141448) 

If we run the same code using a site that serves over https, https://www.google.ie/ in this example:

In [14]: for i in range(10): ....: unsent = grequests.get("https://www.google.ie/", hooks={'response': cb}) ....: print("Request", time.time()) ....: grequests.send(unsent, grequests.Pool(1)) ....: gevent.sleep(0.2) ....: gevent.sleep(5) ....: ('Request', 1459368895.525717) 200 ('**** Response', 1459368895.838453, 19682) ('Request', 1459368895.884151) 200 ('**** Response', 1459368896.168789, 19650) ('Request', 1459368896.22553) 200 ('**** Response', 1459368896.491304, 19632) ('Request', 1459368896.542206) 200 ('**** Response', 1459368896.808875, 19650) ('Request', 1459368896.850575) 200 ('**** Response', 1459368897.144725, 19705) ('Request', 1459368897.173744) 200 ('**** Response', 1459368897.45713, 19649) ('Request', 1459368897.491821) 200 ('**** Response', 1459368897.761675, 19657) ('Request', 1459368897.792373) 200 ('**** Response', 1459368898.331791, 19683) ('Request', 1459368898.350483) 200 ('**** Response', 1459368898.836108, 19713) ('Request', 1459368898.855729) 200 ('**** Response', 1459368899.148171, 19666) 

You will see the behaviour is different. We get 10 responses and no 0 length responses. You should check the status_code in your function to verify you get what you want. The example above explains what you see for the bbc site and most likely what is happening for the others.

Sign up to request clarification or add additional context in comments.

1 Comment

Ok, that explains the 0 length responses. You're right, it doesn't happen on other domains. Any idea about the blocking?
1

The current iteration of grequests contains the following:

from gevent import monkey as curious_george curious_george.patch_all(thread=False, select=False) 

The offending part is select=False - removing this or manually calling monkey.patch_select() solves the problem. I am not sure if this has other side effects.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.