I'm trying to figure out some basic stuff in Rust.

I would like to create a tool that reads 512 bytes from a file and copies them to another file, then skips the next 8 bytes of the input file, then copies the next 512 bytes to the output file, skips another 8 bytes, and so on.

I need this tool to be fast, so I can't just perform an I/O call every 512 bytes. I've figured I would need to read a few megabytes of the input file first, then remove unneeded 8-byte blocks in memory by selectively copying it to another memory block, and then call I/O write to dump the bigger memory block at once.

So, I would like to do something like this (pseudo-code):

let buffer = buffer of 'u8' of size 4MB;
let buffer_out = buffer of 'u8' of size 4MB;
// both buffers above take 8MB of memory

let input_stream = InputStream(buffer);
let output_stream = OutputStream(buffer_out);

for(every 4MB block in the input file) {
    input.read(buffer); // read the 4MB block into 'buffer'
    input_stream.seek(0); // reset the input stream's cursor to offset 0

    for(every 520 byte inside the 4MB block in 'buffer') {
        output_stream.write(input_stream.read(512)); // copy important 512 bytes
        input_stream.read(8); // skip superfluous 8 bytes
    }

    output.write(buffer_out);
}

The problem in Rust I have is that I'm trying to use Cursor object to implement streaming access to both buffers. For example, I'm allocating the buffer on the heap like this:

let mut buf: Box<[u8; BUF_SIZE]> = Box::new([0; BUF_SIZE]); 

And then I'm creating a Cursor to access this array in a streaming mode:

let mut rd_cursor: Cursor<&[u8]> = Cursor::new(buf.as_slice()); 

However, I have no idea how to read the data from the input file now. buf is used by the Cursor, so I can't access it. In C++ I would just read the data to buf and be done with it. And Cursor doesn't seem to implement anything that can be used directly by BufReader.read(), which I use to read data from the input file.

Perhaps I could make it work by creating another buffer, read data from 'input' to the temporary buffer, from temporary buffer to 'buf' through the Cursor, but that would result in constant recopying of memory, which I would like to avoid.

I can see there is a fill_buf function in Cursor, but it seems that it returns only a readonly reference to underlying buffer, so I can't modify the buffer, thus it's useless for my case.

I have also tried using BufReader instead of Cursor. Here is my second try:

let mut rd_cursor: BufReader<&[u8]> = BufReader::new(&*buf); 

BufReader<R> contains get_mut returning R, so I think it should return &[u8] in my case, which sounds like a good thing. But by using &[u8], get_mut complains that I need to pass a mutable thing as R. So I'm changing it like this:

let mut rd_cursor: BufReader<&mut [u8]> = BufReader::new(&mut *buf); 

But Rust won't let me:

src\main.rs|88 col 47| 88:61 error: the trait `std::io::Read` is not implemented for the type `[u8]` [E0277]
|| src\main.rs:88 let mut rd_cursor: BufReader<&mut [u8]> = BufReader::new(&mut *buf);

Could anyone please hit me in the head to fix my understanding of what is happening here?

  • You are aware that BufReader already buffers? You can simply set the capacity to some megabytes and then work on your 512 byte + 8 byte read cycle. Commented Feb 20, 2015 at 8:22
  • Also, you cannot read an unsized array, as Rust doesn't know how many bytes you want. I'm not sure if you can do &mut [u8; BUF_SIZE] but you'd need something like that. Commented Feb 20, 2015 at 8:23
  • @ker: Actually this should be exactly what I need: changing the design ;). I've deeply over-thought the problem and now I'm having problems because of this. Please copy it as an answer, I'll accept it. Commented Feb 20, 2015 at 9:20

1 Answer

BufReader already buffers reads. To quote the docs:

Wraps a Read and buffers input from it

It can be excessively inefficient to work directly with a Read instance. For example, every call to read on TcpStream results in a system call. A BufReader performs large, infrequent reads on the underlying Read and maintains an in-memory buffer of the results.

You could simply set the capacity to a few megabytes (BufReader::with_capacity takes the buffer size in bytes) and then work on your 512 + 8 byte read cycle. The BufReader will only make an actual system call once you have used up the buffer.
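A minimal sketch of that read cycle, assuming a 4 MB capacity on both sides and a hypothetical helper named strip_padding (neither comes from the question). It uses only std::io: by_ref().take(n) limits each step to 512 or 8 bytes, io::copy moves the payload into a BufWriter, and io::sink() swallows the padding. A short final record is copied as far as it goes.

```rust
use std::io::{self, BufReader, BufWriter, Read, Write};

const PAYLOAD: u64 = 512; // bytes kept from each record
const PADDING: u64 = 8; // bytes dropped after each payload
const CAPACITY: usize = 4 * 1024 * 1024; // assumed buffer size (4 MB)

/// Hypothetical helper: copies `input` to `output`, keeping 512 bytes and
/// dropping the following 8, until the input runs out.
/// Returns the number of bytes written.
pub fn strip_padding<R: Read, W: Write>(input: R, output: W) -> io::Result<u64> {
    // Both wrappers batch the many small operations below into large ones.
    let mut reader = BufReader::with_capacity(CAPACITY, input);
    let mut writer = BufWriter::with_capacity(CAPACITY, output);
    let mut written = 0u64;

    loop {
        // Copy at most 512 bytes of payload.
        let copied = io::copy(&mut reader.by_ref().take(PAYLOAD), &mut writer)?;
        written += copied;
        if copied < PAYLOAD {
            break; // end of input (possibly a short final record)
        }

        // Consume and discard at most 8 bytes of padding.
        let skipped = io::copy(&mut reader.by_ref().take(PADDING), &mut io::sink())?;
        if skipped < PADDING {
            break; // input ended inside the padding
        }
    }

    writer.flush()?;
    Ok(written)
}
```

With real files this would be called as strip_padding(File::open(src)?, File::create(dst)?), where src and dst are placeholder paths. Whether 4 MB is the best capacity is something to measure rather than assume.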


The following error

error: the trait std::io::Read is not implemented for the type [u8] [E0277]

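arises because Read is implemented for &[u8] but not for &mut [u8] (a mutable slice implements Write instead). The compiler mentions [u8] because a &mut R only counts as a reader when R itself is one, and the unsized slice type [u8] is not. Switching to &mut [u8; BUF_SIZE] would not help either, since arrays do not implement Read. To read from an in-memory buffer, pass a shared slice (&[u8]) or wrap the buffer in a Cursor.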
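If you still prefer to manage the block yourself, as in the C++ approach the question describes, a plain slice is enough and no Cursor is needed. The sketch below is one way to do it, with hypothetical helper names fill and compact: read takes &mut [u8] directly, and chunks(520) walks the records in memory. It assumes the block size is a multiple of 520, so that records never straddle two blocks; 4 MB exactly is not such a multiple.

```rust
use std::io::{self, Read};

const RECORD: usize = 520; // 512 bytes of payload + 8 bytes of padding
const PAYLOAD: usize = 512;

/// Hypothetical helper: reads from `input` until `buf` is full or the input
/// ends. Returns the number of bytes placed in `buf`.
pub fn fill<R: Read>(input: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match input.read(&mut buf[filled..]) {
            Ok(0) => break, // end of input
            Ok(n) => filled += n,
            Err(ref e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(filled)
}

/// Hypothetical helper: appends the first 512 bytes of every 520-byte record
/// in `block` to `out`. A short trailing record is copied up to 512 bytes.
pub fn compact(block: &[u8], out: &mut Vec<u8>) {
    for record in block.chunks(RECORD) {
        let keep = record.len().min(PAYLOAD);
        out.extend_from_slice(&record[..keep]);
    }
}
```

A driver loop would call fill on the input file, pass &buf[..n] to compact, write out to the output file with write_all, clear out, and repeat until fill returns 0.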
