
I'm uploading a large file (about 2 GB) to an API that accepts the POST method, using Python's requests module. This loads the whole file into memory first and increases memory usage significantly. I believe there must be some way to stream the file to the API without burdening memory. Any suggestions?

P.S.
This old way worked for me, but consumed too much memory:

```python
file = {'file': open(path, 'rb')}
requests.post(url, files=file)
```

The streaming way below shows no memory spike, but the server returns code 400:

```python
requests.post(url, data=open(path, 'rb'))
```
  • Does this answer help? stackoverflow.com/a/29811518/202168 Commented Jul 8, 2022 at 12:23
  • @Anentropic Please see my latest edit just now; it should help clarify the issue. Thank you! Commented Jul 8, 2022 at 13:26

2 Answers


Any suggestions?

Use a streaming upload, as the docs put it:

Requests supports streaming uploads, which allow you to send large streams or files without reading them into memory. To stream and upload, simply provide a file-like object for your body:

```python
with open('massive-body', 'rb') as f:
    requests.post('http://some.url/streamed', data=f)
```
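If you need to produce the body incrementally yourself, requests also accepts a generator for data and then sends the body with chunked transfer encoding. A minimal sketch (the URL is a placeholder, and this assumes the server accepts a raw request body):

```python
def read_in_chunks(path, chunk_size=1024 * 1024):
    """Yield a file in fixed-size chunks so only one chunk is in memory at a time."""
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk

# Passing a generator makes requests use chunked transfer encoding:
# requests.post('http://some.url/streamed', data=read_in_chunks(path))
```

Either way, only a small buffer is held in memory at any moment instead of the whole 2 GB file.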

2 Comments

Now I know where the memory issue lies: I put the file object in a dict, file = {'file': open(path, 'rb')}, and then posted it with requests.post(url, files=file). If I put the file object directly into the post data as you wrote, I run into no issue. Thank you!
I'm sorry, but if I don't upload the file the file = {'file': open(path, 'rb')} way, the server responds with code 400. I have updated my question to reflect this feedback.

When you pass the files arg, the requests lib makes a multipart form upload, i.e. it is like submitting a form where the file is passed as a named field (file in your example).

I suspect the problem you saw is that when you pass a file object as the data arg, as suggested in the docs here https://requests.readthedocs.io/en/latest/user/advanced/#streaming-uploads, requests does a streaming upload, but the file content is used as the whole HTTP POST body.

So I think the server at the other end is expecting a form with a file field, but we're just sending the binary content of the file by itself.

What we need is some way to wrap the content of the file with the right "envelope" as we send it to the server, so that it can recognise the data we are sending.
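You can see the difference concretely by preparing (without sending) both request styles and inspecting the Content-Type header. The URL and body here are placeholders for illustration:

```python
import requests

# Compare what the two upload styles would put on the wire.
raw = requests.Request('POST', 'http://example.com/upload',
                       data=b'raw file bytes').prepare()
form = requests.Request('POST', 'http://example.com/upload',
                        files={'file': ('name.bin', b'raw file bytes')}).prepare()

# Raw bytes get no Content-Type envelope; files gets multipart/form-data
# with a boundary, which is what a form-expecting server looks for.
print(raw.headers.get('Content-Type'))
print(form.headers.get('Content-Type'))
```

The multipart body wraps the file content between boundary markers and includes the field name and filename, which is the "envelope" the server wants.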

See this issue where others have noted the same problem: https://github.com/psf/requests/issues/1584

I think the best suggestion from there is to use this additional lib, which provides streaming multipart form file upload: https://github.com/requests/toolbelt#multipartform-data-encoder

For example:

```python
from requests_toolbelt import MultipartEncoder
import requests

encoder = MultipartEncoder(
    fields={'file': ('myfilename.xyz', open(path, 'rb'), 'text/plain')}
)
response = requests.post(
    url,
    data=encoder,
    headers={'Content-Type': encoder.content_type}
)
```

1 Comment

Yes, eventually I found the same lib as yours and tested it out. It worked like a charm! This seemingly simple question led me on a long and complex journey. Thank you!
