Avoid C-Arrays! Avoid C-Arrays! Avoid Arrays!
Why?
int final_score (int const* all, int n);
Does the function also take ownership of the array (delete it in the end) or will it just read the values?
point* get_random_points (int howmany);
Does this create a new array that must be deleted manually afterwards? Or is the memory managed by something else, e.g., a memory pool or external library?
- length information not part of the array
- need two variables/function parameters for memory location and length
- independent pointer & length with potentially inconsistent values
- Heartbleed -like read/write-past-array-end situations
Unlike proper C++ containers, arrays
- are not deeply copyable or copy-assignable
- (if allocated on the heap) don't clean up memory by themselves
- don't support insertion, resizing, …
C++'s de-facto default container
#include <vector>
std::vector<int> w {1,3,4}; cout << w[1] << '\n'; // prints 3 for (int x : w) { /* loop over elements */ }
w.push_back(2); // w: {1,3,4,2} w.insert(w.begin(), 8}); // w: {8,1,3,4,2}
auto s = w.size(); // s: 5 if (!w.empty()) { … }
// pass to C library c_lib_fn(w.data(), w.size()); - overhead-free random access
- fast traversal; good for linear searches
- insertion at end in amortized constant time
- potentially slow if insert/erase operations at random positions dominate
- slow if element type has high copy/assignment cost
- potentially long allocation times for very large amount of values (can be mitigated, see here )
#include <array> std::array<int,4> a {0,3,7,9}; cout << a[2] << '\n'; // prints 7 auto s = a.size(); // s: 4 if (!a.empty()) { … } for (int x : a) { /* loop over elements */ }
// allocate on heap auto ah = make_unique<array<int,1000000>(); - overhead-free random access
- fast traversal; good for linear searches
- knows its size
- size is fixed at compile time
#include <deque>
std::deque<int> d {1,3,4}; cout << d[1] << '\n'; // prints 3 for (int x : d) { /* loop over elements */ }
d.push_back(2); // w: {1,3,4,2} d.pop_front(); // w: {3,4,2} d.push_front(8); // w: {8,3,4,2}
auto s = d.size(); // s: 4 if (!d.empty()) { … } - constant-time random access (extremely small overhead)
- fast traversal; good for linear searches
- good insertion or deletion performance; especially at begin & end
- slow if element type has high copy/assignment cost
- potentially long allocation times for very large amount of values (can be mitigated, see here )
#include <deque>
std::list<HeavyWeight> ls {1,3,4}; cout << ls[1] << '\n'; // prints 3 for (auto const& x : ls) { /* loop over elements */ } if (!ls.empty()) { … }
std::list<HeavyWeight> ks {7,8,9}; ls.splice(ls.begin()+1, std::move(ks)); // ls: {1, 7,8,9, 3,4} ks: {} - constant-time splicing (of complete lists)
- restructuring operations don't require elements to be moved or copied (good for storing large objects with high copy/assignment cost)
- random access only in linear time
- slow traversal due to bad memory locality (can be mitigated to some extend by using custom allocators)
#include <span> - lightweight (= cheap to copy, can be passed by value)
- non-owning view (= not responsible for allocating or deleting memory)
- of a contiguous memory block (of e.g.,
std::vector, C-array, …)
span<int> | sequence of integers whose values can be changed |
span<int const> | sequence of integers whose values can't be changed |
span<int,5> | sequence of exactly 5 integers (number of values fixed at compile time) |
#include <span>
int final_score (std::span<int const> scores);
vector<int> w {0, 1, 2, 3, 4, 5, 6}; |<--------->| int a = final_score({w.begin()+2, 5}); // sub-range int b = final_score(w); // view all of w - overhead-free random access
- knows its size
- function interfaces with well-expressed intent
- use in function interfaces
- avoid for local variables (danger of dangling references)
class Block { public: explicit Block(std::initializer_list<int>); … };
Block::Block(initializer_list<int> l): … { if (l.size() == 2) { … } for (int x : l) { /* loop over values */ } }
Block b1 {2}; Block b2 {2,4,6,8}; - variadic number of values
- a lot safer and simpler than variadic templates
- only values of same type possible
- can't be used with non-copyable/move-only types
- problematic in performance-critical code
Situation:
need very large uninitialized contiguous memory block of int/double, e.g., as target for GPU results
Problem:
std::vector value-initializes its underlying memory block, which means that, e.g., ints are initialized with 0 ⇒ slow (up to minutes for multi-gigabyte buffers)
Solutions
At Least Isolate Them! At Least: Isolate! At Least Isolate
If you need to interface with C code, at least isolate array handling from the rest of the code using a span .
int final_score (std::span<int const> in) { if (in.empty()) return NO_SCORE; … }
int* p = c_lib_get_scores(); int len = c_lib_get_num_scores(); std::span<int> scores {p, len}; for (int s : scores) { … } int fs = final_score(scores); span isolates array handling from implementation of final_score
If you think you really need to use a C-array
- allocate with
std::make_unique⇒unique_ptrowns memory - allocate with C++20's
std::make_unique_for_overwrite, if you want to avoid expensive value-initialization (e.g., avoid initialization ofints with0) - use a span to access / pass the array around
#include <memory> SampleStats statistics (Samples const& in) { // make uninitialized array auto buf = std::make_unique_for_overwrite<int[]>(in.size()); // obtain view to it: std::span<int> results {buf.get(), in.size()}; // compute on GPU gpu_statistics(in, results); prefix_sum(results); … } // memory automatically deallocated
Comments…