I am studying how to fetch data from a website as fast as possible. To speed things up, I am considering using multiple threads. Here is the code I used to compare multi-threaded posting with simple sequential posting.

    import threading
    import time
    import urllib
    import urllib2


    class Post:

        def __init__(self, website, data, mode):
            self.website = website
            self.data = data

            # mode is either "Simple" (simple POST) or "Multiple" (multi-threaded POST)
            self.mode = mode

        def post(self):
            # post the data
            req = urllib2.Request(self.website)
            open_url = urllib2.urlopen(req, self.data)

            if self.mode == "Multiple":
                time.sleep(0.001)

            # read the HTML data
            HTMLData = open_url.read()

            print "OK"


    if __name__ == "__main__":
        current_post = Post("http://forum.xda-developers.com/login.php",
                            "vb_login_username=test&vb_login_password&securitytoken=guest&do=login",
                            "Simple")

        # save the time before posting the data
        origin_time = time.time()

        if current_post.mode == "Multiple":
            # multi-threaded POST
            for i in range(0, 10):
                thread = threading.Thread(target=current_post.post)
                thread.start()
                thread.join()

            # calculate the time interval
            time_interval = time.time() - origin_time
            print time_interval

        if current_post.mode == "Simple":
            # simple POST
            for i in range(0, 10):
                current_post.post()

            # calculate the time interval
            time_interval = time.time() - origin_time
            print time_interval

As you can see, the code is very simple. First I set the mode to "Simple" and got a time interval of 50 s (maybe my connection is a little slow :(). Then I set the mode to "Multiple" and got a time interval of 35 s. From that I can see that multi-threading can actually increase the speed, but the result isn't as good as I imagined. I want it to be much faster.
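While rereading the loop, one thing I am unsure about is the placement of thread.join(). It is called right after thread.start() inside the loop, so each thread may finish before the next one starts, which would mean the requests still run one after another. Below is a minimal sketch of the alternative: start all threads first and join them afterwards. It assumes a hypothetical fake_post function that just sleeps in place of the real urlopen call, so the timings only illustrate the pattern and say nothing about the actual site.

```python
import threading
import time


def fake_post(delay=0.05):
    # Hypothetical stand-in for urlopen(...).read(): blocks for `delay` seconds.
    time.sleep(delay)


def run_start_then_join_each(n):
    # Same pattern as my test loop: each thread is joined right after it
    # starts, so only one "request" is in flight at any time.
    started = time.time()
    for _ in range(n):
        t = threading.Thread(target=fake_post)
        t.start()
        t.join()
    return time.time() - started


def run_start_all_then_join(n):
    # Start every thread first, then wait for all of them to finish.
    started = time.time()
    threads = [threading.Thread(target=fake_post) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.time() - started


if __name__ == "__main__":
    print("join inside loop: %.2f s" % run_start_then_join_each(10))
    print("join after loop:  %.2f s" % run_start_all_then_join(10))
```

With the sleep stand-in, the second pattern should take roughly the time of one call instead of ten. Whether the real POST behaves the same way is something I have not measured.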
From debugging, I found that the program mainly blocks at the line: open_url = urllib2.urlopen(req, self.data). This line takes a long time to post to and receive data from the specified website. I guess I might get a speedup by adding time.sleep() and using multi-threading inside the urlopen function, but I cannot do that because it is Python's own function.
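As far as I understand, a thread blocked in network I/O does not stop other Python threads from running, so it may be enough to call urlopen unchanged from several worker threads rather than change urlopen itself. A rough sketch of that idea, with a small fixed pool of workers and two queues, follows. The names blocking_fetch, worker, and fetch_all are hypothetical, and blocking_fetch only sleeps to stand in for the real request.

```python
import threading
import time

try:
    import queue  # Python 3
except ImportError:
    import Queue as queue  # Python 2


def blocking_fetch(item):
    # Hypothetical stand-in for the unmodified urlopen(...).read() call.
    time.sleep(0.05)
    return "response for %s" % item


def worker(jobs, results):
    # Pull work items until the None sentinel arrives.
    while True:
        item = jobs.get()
        if item is None:
            break
        results.put((item, blocking_fetch(item)))


def fetch_all(items, num_workers=5):
    jobs = queue.Queue()
    results = queue.Queue()
    threads = [threading.Thread(target=worker, args=(jobs, results))
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    for item in items:
        jobs.put(item)
    for _ in threads:
        jobs.put(None)  # one sentinel per worker so each one exits
    for t in threads:
        t.join()
    # All workers have finished, so the results queue can be drained safely.
    collected = {}
    while not results.empty():
        key, value = results.get()
        collected[key] = value
    return collected


if __name__ == "__main__":
    responses = fetch_all(range(10), num_workers=5)
    print(len(responses))
```

The number of workers is a guess on my part. A fixed pool keeps the number of simultaneous connections bounded, which seems safer than one thread per request if the list of posts grows.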
Leaving aside the possibility that the server throttles the post rate, what else can I do to get a faster speed? Is there any other code I can modify? Thanks a lot!