How do I write a futures::Stream to disk without storing it entirely in memory first?

Question

There's an example of downloading a file with Rusoto S3 here: How to save a file downloaded from S3 with Rusoto to my hard drive?

The problem is that it appears to download the whole file into memory before writing it to disk, because it uses the write_all method, which takes a slice of bytes rather than a stream. How can I use StreamingBody, which implements futures::Stream, to stream the file to disk?

for x in stream { file.write_all(&x) } something like that...
– Stargateur, Commented Nov 11, 2018 at 4:04
That would require StreamingBody to be an iterator, which it is not.
– Nicholas Bishop, Commented Nov 11, 2018 at 4:09

Shepmaster · Accepted Answer · 2018-11-12 14:12:37Z

Since StreamingBody implements Stream<Item = Vec<u8>, Error = Error>, we can construct an MCVE that represents that:

    extern crate futures; // 0.1.25

    use futures::{prelude::*, stream};

    type Error = Box<dyn std::error::Error>;

    // Stand-in for StreamingBody: yields four owned chunks of bytes.
    fn streaming_body() -> impl Stream<Item = Vec<u8>, Error = Error> {
        const DUMMY_DATA: &[&[u8]] = &[b"0123", b"4567", b"89AB", b"CDEF"];
        let iter_of_owned_bytes = DUMMY_DATA.iter().map(|&b| b.to_owned());
        stream::iter_ok(iter_of_owned_bytes)
    }

We can then get a "streaming body" somehow and use Stream::for_each to process each element in the Stream. Here, we just call write_all with some provided output location:

    use std::{fs::File, io::Write};

    fn save_to_disk(mut file: impl Write) -> impl Future<Item = (), Error = Error> {
        streaming_body().for_each(move |chunk| file.write_all(&chunk).map_err(Into::into))
    }

We can then write a little testing main:

    fn main() {
        let mut file = Vec::new();

        {
            let fut = save_to_disk(&mut file);
            fut.wait().expect("Could not drive future");
        }

        assert_eq!(file, b"0123456789ABCDEF");
    }
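
The test above writes into a Vec; for comparison, here is a minimal std-only sketch of the same chunk-at-a-time loop targeting a real file. It is a synchronous analogue, not the futures API: the iterator of Results is a stand-in for the stream (in futures 0.1, Stream::wait produces a blocking iterator of this general shape), and save_chunks and the temp-file name are hypothetical names chosen for illustration. Only one chunk is held in memory at a time.

```rust
use std::fs::File;
use std::io::{self, BufWriter, Write};

/// Writes each chunk to `out` as soon as it is produced and returns the
/// total number of bytes written. Each chunk is dropped before the next
/// one is pulled from the iterator.
fn save_chunks<I, W>(chunks: I, out: W) -> io::Result<u64>
where
    I: IntoIterator<Item = io::Result<Vec<u8>>>,
    W: Write,
{
    let mut out = BufWriter::new(out);
    let mut total = 0u64;
    for chunk in chunks {
        let chunk = chunk?; // stop at the first upstream error
        out.write_all(&chunk)?;
        total += chunk.len() as u64;
    }
    out.flush()?;
    Ok(total)
}

fn main() -> io::Result<()> {
    // Hypothetical output location, used only for this demonstration.
    let path = std::env::temp_dir().join("save_chunks_demo.bin");
    let chunks: Vec<io::Result<Vec<u8>>> =
        vec![Ok(b"0123".to_vec()), Ok(b"4567".to_vec())];

    let written = save_chunks(chunks, File::create(&path)?)?;
    assert_eq!(written, 8);
    assert_eq!(std::fs::read(&path)?, b"01234567");
    std::fs::remove_file(&path)
}
```

Like the for_each version, this blocks the calling thread on every write, so it suits tests and synchronous programs rather than code running on an event loop.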

Important notes about the quality of this naïve implementation:

The call to write_all may potentially block, which you should not do in an asynchronous program. It would be better to hand off that blocking work to a threadpool.
The usage of Future::wait forces the thread to block until the future is done, which is great for tests but may not be correct for your real use case.
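
As a rough illustration of the first note, the sketch below moves the blocking write_all calls onto a dedicated writer thread fed by a bounded channel. It uses only the standard library and is an assumption about one possible design, not the approach an async runtime would provide; spawn_writer and its capacity parameter are hypothetical names. The bounded channel keeps at most `capacity` chunks queued, so memory use stays limited even when the disk is slower than the producer.

```rust
use std::io::{self, Write};
use std::sync::mpsc;
use std::thread;

/// Spawns a thread that owns `out` and writes every chunk it receives.
/// The thread finishes once all senders are dropped and hands `out` back.
fn spawn_writer<W>(
    mut out: W,
    capacity: usize,
) -> (mpsc::SyncSender<Vec<u8>>, thread::JoinHandle<io::Result<W>>)
where
    W: Write + Send + 'static,
{
    let (tx, rx) = mpsc::sync_channel::<Vec<u8>>(capacity);
    let handle = thread::spawn(move || -> io::Result<W> {
        // Iteration ends when every sender has been dropped.
        for chunk in rx {
            out.write_all(&chunk)?;
        }
        out.flush()?;
        Ok(out)
    });
    (tx, handle)
}

fn main() {
    const DUMMY_DATA: &[&[u8]] = &[b"0123", b"4567", b"89AB", b"CDEF"];

    let (tx, handle) = spawn_writer(Vec::new(), 2);
    for chunk in DUMMY_DATA {
        // Blocks only while the queue already holds `capacity` chunks.
        tx.send(chunk.to_vec()).expect("writer thread hung up");
    }
    drop(tx); // signal end of input

    let written = handle
        .join()
        .expect("writer thread panicked")
        .expect("write failed");
    assert_eq!(written, b"0123456789ABCDEF");
}
```

Note that SyncSender::send itself blocks when the queue is full, so inside an asynchronous program the producer side would need a non-blocking channel or the runtime's own facility for blocking work.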
