
I am trying to download a large file with the Python requests library by setting stream=True.

But I want this function to execute asynchronously, sending the response back to the client while the download continues in the background.

Here is my code

import asyncio
import os
import requests

async def downloadFile(url, filename):
    r = requests.get(url, stream=True)
    with open(os.path.join('./files', filename), 'wb+') as f:
        for chunk in r.iter_content(chunk_size=1024):
            if chunk:
                f.write(chunk)
    # Creating same file name with _done appended
    # to know that file has been downloaded
    with open(os.path.join('./files', filename + '_done'), 'w+') as f:
        f.close()
    await asyncio.sleep(1)

I am calling this function from another function like this:

# check if file exists on server
if os.path.exists(os.path.join('./files', fileName)):
    # file exists!
    # check if done file exists
    if os.path.exists(os.path.join('./files', fileName + '_done')):
        # done file exists
        self.redirect(self.request.protocol + "://" + self.request.host + '/files/' + fileName)
    else:
        # done file does not exist, wait 5 min more
        self.write('Wait 5 min')
        self.finish()
else:
    # file doesn't exist, initiate download
    self.write('Wait 5 min')
    self.finish()
    d = asyncio.ensure_future(downloadFile(fileRes, fileName))
    # loop = asyncio.get_event_loop()
    # loop.run_until_complete(d)

The problem is that the file is created but its size remains 0, and the file with "_done" appended is never created. What am I doing wrong here?

  • Your code works for me. Perhaps enable debugging for requests? Commented Dec 20, 2018 at 17:21
  • Your code works for me too Commented Dec 20, 2018 at 17:21
  • This code works the first time only. If I immediately make the request again, it waits for the whole file to download. I have updated my question with details about how the download function is called, please check. Commented Dec 20, 2018 at 18:52

1 Answer


Your code works for me. Maybe it is the resource you're trying to fetch that does not work.

You might want to try enabling debug logging for requests as suggested by @brennan, and/or add printouts to your code to follow what's happening:
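As an aside on the debug option: requests does not have a single debug switch of its own, but since it is built on urllib3 and http.client, verbose HTTP logging can be enabled at that level. A minimal sketch using only the standard library (the usual recipe for debugging requests traffic):

```python
# Minimal sketch: enable wire-level debugging for requests.
# requests uses urllib3, which in turn uses http.client, so raising
# http.client's debuglevel and urllib3's log level makes the HTTP
# request/response traffic visible.
import logging
import http.client

http.client.HTTPConnection.debuglevel = 1  # print raw request/response lines

logging.basicConfig(level=logging.DEBUG)
urllib3_logger = logging.getLogger("urllib3")
urllib3_logger.setLevel(logging.DEBUG)
urllib3_logger.propagate = True
```

With that in place, every requests.get call prints the headers it sends and receives. The printout approach looks like this: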

>>> import os
>>> import requests
>>> import asyncio
>>>
>>>
>>> async def downloadFile(url, filename):
...     print(f"• downloadFile({url}, {filename})")
...     r = requests.get(url, stream=True)
...     print(f"  → r: {r}")
...     with open(os.path.join('./files', filename), 'wb+') as f:
...         print(f"  → f is opened: {f}")
...         for chunk in r.iter_content(chunk_size=1024):
...             print(f"  → chunk is: {chunk}")
...             if chunk:
...                 f.write(chunk)
...     # Creating same file name
...     # with _done appended to know that file has been downloaded
...     with open(os.path.join('./files', filename + '_done'), 'w+') as f:
...         print(f"  → creating output with _done")
...         f.close()
...     print(f"  → wait 1")
...     await asyncio.sleep(1)
...
>>>
>>>
>>> d = asyncio.ensure_future(downloadFile('https://xxx/yyy.jpg', 'test.jpg'))
>>> loop = asyncio.get_event_loop()
>>> loop.run_until_complete(d)
• downloadFile(https://xxx/yyy.jpg, test.jpg)
  → r: <Response [200]>
  → f is opened: <_io.BufferedRandom name='./files/test.jpg'>
  → chunk is: b'\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01\x01\x00\x00\x01\x00\x01\x00\x00\xff\xdb\x00C\x00\r\t\n\x0b\n\x08\r\x0b\n\x0b\x0e\x0e\r\x0f\x13....'
  → chunk is: ...
  ...
  → creating output with _done
  → wait 1

That makes the _done part of your code unnecessary (you only need the printouts), and the wait at the end as well (when it's done… it's done!):

async def downloadFile(url, filename):
    r = requests.get(url, stream=True)
    with open(os.path.join('./files', filename), 'wb+') as f:
        for chunk in r.iter_content(chunk_size=1024):
            if chunk:
                f.write(chunk)

Though you might want to catch any issues that occur while connecting to the server and act accordingly:

async def downloadFile(url, filename):
    try:
        r = requests.get(url, stream=True)
        r.raise_for_status()  # to raise on invalid statuses
        with open(os.path.join('./files', filename), 'wb+') as f:
            for chunk in r.iter_content(chunk_size=1024):
                if chunk:
                    f.write(chunk)
    except requests.RequestException as err:
        # do something smart when that exception occurs!
        print(f"Exception has occurred: {err}")

3 Comments

Actually this works the first time only, but if I immediately request again, it waits for the response until the file has been downloaded. Here is the workflow: 1. The script searches for the file on the server. If found, it returns it; if not, it starts the download on the server after sending the response "wait 5 min" to the user. 2. If the file has been downloaded and the "_done" file exists, it sends the file to the requesting user. 3. If the file exists but the "_done" file does not, the file must still be downloading on the server, and a response is sent back telling the user to wait 5 min.
I have updated my question with more details on how the download is initiated.
OK, TBH, I don't understand your requirements / what you're trying to achieve… I think you're in a deep case of an X-Y Problem. It looks like you're trying to circumvent issues downloading a file by brute-force redownloading that file until it eventually works. But you can achieve that more elegantly without the _done file, by using requests' statuses (cf. the last example in my answer). The right way depends mainly on why you need to watch the file and redownload it.
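A likely cause of the blocking described in these comments, not covered above: requests is synchronous, so calling it inside a coroutine holds the event loop until the download finishes, and every other request served by that loop has to wait. A minimal sketch of offloading the blocking call to a worker thread with run_in_executor — here blocking_download is a hypothetical stand-in for the requests-based download code:

```python
import asyncio
import time

def blocking_download(url, filename):
    # stand-in for requests.get(url, stream=True) + writing chunks to disk
    time.sleep(0.1)
    return filename

async def downloadFile(url, filename):
    loop = asyncio.get_running_loop()
    # run_in_executor moves the blocking call onto a thread-pool worker,
    # so the event loop stays free to serve other requests meanwhile
    return await loop.run_in_executor(None, blocking_download, url, filename)

print(asyncio.run(downloadFile('https://xxx/yyy.jpg', 'test.jpg')))  # prints "test.jpg"
```

With this pattern the handler can schedule the download with asyncio.ensure_future and finish its own response immediately, while concurrent requests are still answered.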
