Efficient way to reuse buffer

Question

I have a piece of code that is reading into a buffer of constant size, which can be simplified to basically

    {
        constexpr std::size_t buffer_size = 262144u;
        std::vector<char> buffer(buffer_size);  // vector is reallocated every time
        read(buffer.data(), buffer.size());     // write into the buffer
        ...
    }

For performance I was considering making the vector static to reduce re-allocating every time this function is called, something like this

    {
        constexpr std::size_t buffer_size = 262144u;
        static std::vector<char> buffer(buffer_size);  // vector is allocated only once
        buffer.assign(buffer.size(), '\0');            // "reset" the vector contents
        read(buffer.data(), buffer.size());            // write into the buffer
        ...
    }

I tried benchmarking these two methods and they appear to be nearly identical (for Clang with -O3) in terms of performance.

Is this linked benchmark a good test for the behavior I described?
Is one of these two methods "obviously" preferred over the other, from a performance perspective?

"preferred over the other, from a performance perspective?" From a performance (and memory) perspective, I've seen projects that use only global variables. But global variables have their own problems, and don't forget the rules of optimization. It's funny how, in your benchmark, the memset-to-'\0' part takes far longer than the call to malloc. If you lower the buffer size in your benchmark, you'll see the difference. Still, in your benchmark, the static vector is faster.
– KamilCuk, Commented Oct 16, 2020 at 19:12
Do you need to clear the vector if you are immediately going to read data into it?
– catnip, Commented Oct 16, 2020 at 19:40
The function that will write to the buffer (read in my example) is black-box third-party code, so I'm not sure what its exact requirements are, e.g. whether the buffer can be full of arbitrary data or has to be "zeroed out". I think that's a great point though: if I can skip writing to the whole buffer, that should be a clear winner.
– Cory Kramer, Commented Oct 16, 2020 at 19:43
You are measuring both allocating elements and initializing them. If you want to compare just the time to allocate the memory, it might be better to skip the buffer.assign() line, use the default constructor for vector, and reserve the space. Of course, don't call read either, since the memory is allocated but not initialized, but your benchmark already doesn't involve read. This might tell you that the time to initialize the data dwarfs the time to allocate it.
– JaMiT, Commented Oct 16, 2020 at 19:45

Dest987 · Accepted Answer · 2020-10-16 19:51:32Z

The issue is that you're not really "reusing" your vector since you're zeroing out the whole thing. If you're going to write into the buffer anyways, then you don't have to zero it out.
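As a rough sketch of what reusing the buffer without zeroing could look like: the question does not show the real signature of read, so fake_read below is a hypothetical stand-in that is assumed to report how many bytes it wrote, and count_newlines is an invented example of "the rest of the function".

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for the black-box read(): copies up to `capacity`
// bytes of `source` into `dest` and returns how many bytes it wrote.
std::size_t fake_read(char* dest, std::size_t capacity, const std::string& source) {
    const std::size_t count = std::min(capacity, source.size());
    std::memcpy(dest, source.data(), count);
    return count;
}

// Counts newline characters in one chunk, reusing a single buffer across calls.
std::size_t count_newlines(const std::string& source) {
    constexpr std::size_t buffer_size = 262144u;
    static std::vector<char> buffer(buffer_size);  // allocated (and zeroed) once

    // No buffer.assign(...) here: stale bytes from earlier calls stay in place.
    const std::size_t bytes_read = fake_read(buffer.data(), buffer.size(), source);

    // Only the first bytes_read bytes are meaningful for this call.
    const auto valid_end = buffer.begin() + static_cast<std::ptrdiff_t>(bytes_read);
    return static_cast<std::size_t>(std::count(buffer.begin(), valid_end, '\n'));
}
```

Note that a function-local static buffer is shared by every caller, so calling such a function from several threads at once would be a data race; a thread_local buffer is one possible way around that.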

Skipping the zeroing could be an issue if you write the entire vector out somewhere a user could see it, since you could leak part of whatever was in the buffer before, but that is rarely a concern.
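One way to sidestep the leak is to forward only the bytes that were written during the current call. The sketch below assumes, again, that the read function reports a byte count; fake_read, forward_chunk and forwarded are hypothetical names used only for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the black-box read(): copies up to `capacity`
// bytes of `source` into `dest` and returns how many bytes it wrote.
std::size_t fake_read(char* dest, std::size_t capacity, const std::string& source) {
    const std::size_t count = std::min(capacity, source.size());
    std::memcpy(dest, source.data(), count);
    return count;
}

// Reads one chunk into the reused buffer and forwards it to `out`.
void forward_chunk(std::ostream& out, const std::string& source) {
    constexpr std::size_t buffer_size = 262144u;
    static std::vector<char> buffer(buffer_size);  // reused, never re-zeroed

    const std::size_t bytes_read = fake_read(buffer.data(), buffer.size(), source);

    // Writing buffer.size() bytes here would also emit stale data from
    // earlier calls. Writing bytes_read bytes emits only the fresh data.
    out.write(buffer.data(), static_cast<std::streamsize>(bytes_read));
}

// Convenience helper: returns exactly what forward_chunk sends downstream.
std::string forwarded(const std::string& source) {
    std::ostringstream out;
    forward_chunk(out, source);
    return out.str();
}
```

With this shape, a long chunk followed by a short one forwards only the short chunk the second time, even though the tail of the long chunk is still sitting in the buffer.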

Here's your benchmark without the clear and instead just writing in 1024 bytes to simulate file reading. https://quick-bench.com/q/Qkb0NTMAQ0kaydKbdpOPhQGgR6U

The purpose of reusing the buffer is to avoid memory allocation, which is much more expensive than zeroing the buffer.
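To check that the static version really does avoid repeated allocation, one option is to count allocations directly instead of timing them. The sketch below uses a hand-written counting allocator; CountingAllocator, fresh_buffer, reused_buffer and count_allocations are all hypothetical names, and the snippet says nothing about how long each allocation takes.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Incremented on every allocation made through CountingAllocator.
std::size_t allocation_count = 0;

// Minimal allocator that forwards to std::allocator and counts allocations.
template <class T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        ++allocation_count;
        return std::allocator<T>().allocate(n);
    }
    void deallocate(T* p, std::size_t n) noexcept {
        std::allocator<T>().deallocate(p, n);
    }
};

template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return false; }

using Buffer = std::vector<char, CountingAllocator<char>>;
constexpr std::size_t buffer_size = 262144u;

// Variant 1: a new vector, and therefore a new allocation, on every call.
void fresh_buffer() {
    Buffer buffer(buffer_size);
    buffer[0] = 'x';  // stand-in for the real work
}

// Variant 2: one static vector, allocated on the first call only.
void reused_buffer() {
    static Buffer buffer(buffer_size);
    buffer[0] = 'x';  // stand-in for the real work
}

// Resets the counter, calls fn `calls` times, and returns the allocations seen.
std::size_t count_allocations(void (*fn)(), int calls) {
    allocation_count = 0;
    for (int i = 0; i < calls; ++i) fn();
    return allocation_count;
}
```

Comparing count_allocations(fresh_buffer, 10) with count_allocations(reused_buffer, 10) should show the fresh variant allocating on every call, while the static variant allocates only during its first call. The counter is reset at the start of each measurement, so the static variant has to be measured starting from its very first call.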
