
I am having a memory issue with std::vector in C++. Here is my code:

#include <iostream>
#include <vector>

int main () {
    std::vector< std::vector<float> > mesh_points_A;
    int N=10;
    for(int i=0;i<N;i++){
        for(int j=0;j<N;j++){
            std::vector<float> xyz;
            xyz.push_back(i);
            xyz.push_back(j);
            xyz.push_back(0.3);
            mesh_points_A.push_back(xyz);
        }
    }
    return 0;
}

When I increase N to 10000 or higher I run out of memory... But I think I am doing something fundamentally wrong, because if I used, for example, Python with NumPy arrays, this would easily be possible...

Many thanks in advance.

EDIT: Below is the original code; the code above was just a simplification to illustrate the problem. The question is whether it is somehow possible to create many Surface objects (the code currently creates two) without running out of memory while keeping N=10000.

// classes example
// compile with c++ -o Surface Surface.cpp -std=c++11
#include <iostream>
#include <vector>
#include <array>

class Surface {
private:
    std::vector< std::array<float,3> > mesh_points_A;
public:
    float R;
    float z;   // z position of the surface
    int n_A;   // number of area mesh points, mesh_points_A.size()
    Surface(int nxA, float R, float z);
};

Surface::Surface(int nxA, float R, float z) : z(z), R(R) {
    float dxA = 2*R/(nxA*1.0-1.0);

    // determine n_A
    n_A = 0;
    for(int i=0;i<nxA;i++){
        float x = -R+i*dxA;
        for(int j=0;j<nxA;j++){
            float y = -R+j*dxA;
            if(x*x+y*y<R*R){
                n_A += 1;
            }
        }
    }
    std::cout<<"Number of area mesh points: "<<n_A<<std::endl;

    mesh_points_A.reserve(n_A);
    for(int i=0;i<nxA;i++){
        float x = -R+i*dxA;
        for(int j=0;j<nxA;j++){
            float y = -R+j*dxA;
            if(x*x+y*y<R*R){
                std::array<float,3> xyz{ {x,y,z} };
                mesh_points_A.push_back(xyz);
            }
        }
    }
}

int main () {
    int N = 20000;
    Surface s1(N,0.1,0.0);
    Surface s2(N,0.1,0.1);
    return 0;
}

2 Answers


Your vector needs to successively reallocate more memory to keep growing. It does this by reserving a new, larger area of memory and copying the old data over. It depends on the implementation how much more memory is reserved, but a typical strategy is to allocate twice as much memory (libstdc++ does this).
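
If you want to see this on your own platform, here is a minimal sketch (not part of the original answer) that prints the capacity every time it changes; with libstdc++ the reported capacity roughly doubles at each reallocation:

#include <cstddef>
#include <iostream>
#include <vector>

int main() {
    std::vector<int> v;
    std::size_t last_capacity = 0;
    for (int i = 0; i < 1000; ++i) {
        v.push_back(i);
        if (v.capacity() != last_capacity) {  // capacity changed, so a reallocation just happened
            last_capacity = v.capacity();
            std::cout << "size " << v.size() << " -> capacity " << v.capacity() << '\n';
        }
    }
    return 0;
}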

This means that, in the worst case, your total memory requirement could be close to three times as much as your raw memory requirement:

Let’s say your vector currently holds 90,000,000 elements and, by bad luck, its capacity is also 90,000,000¹. To insert the 90,000,001st element, std::vector now reserves twice as much memory (180,000,000 elements), copies all the old elements over, and then destructs the old array.

Therefore, even though you “only” need 100,000,000 elements, you briefly had to allocate storage for 270,000,000 elements. This corresponds to around 9.10 GiB, even though your 100M vector only requires 3.35 GiB.

This can be neatly avoided by putting the following line in front of your nested initialisation loop:

mesh_points_A.reserve(N * N); 
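
In the code from the question, the call would go right before the nested loops, roughly like this (a sketch; it only removes the reallocation overhead of the outer vector, while the per-point heap allocation of each inner std::vector<float> remains, which the other answer addresses):

#include <cstddef>
#include <vector>

int main() {
    std::vector<std::vector<float>> mesh_points_A;
    int N = 10000;
    mesh_points_A.reserve(static_cast<std::size_t>(N) * N);  // one up-front allocation for the outer vector
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            std::vector<float> xyz;
            xyz.push_back(i);
            xyz.push_back(j);
            xyz.push_back(0.3f);
            mesh_points_A.push_back(std::move(xyz));  // move the inner vector instead of copying it
        }
    }
    return 0;
}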

¹ More realistically, the capacity is probably a power of two, e.g. 2²⁶ = 67,108,864; that’s still 6.75 GiB of memory for the resizing.


2 Comments

Thanks for this helpful answer. I incorporated this together with the suggestion to use std::array for the 3-dimensional point. I no longer have problems when I choose N=10000, but at 20000 I seem to run out of memory again. I checked against numpy and it also runs out of memory at N=20000, so this should be fine. I think I just have to move to a larger computer when I want to increase N. One last question: my mesh_points_A is actually a member vector of a class. When I create more than one object, is there any way around the fact that I cannot create many objects that each hold such a large member vector?
I will add my original code with mesh_points_A as a member vector to the original post.

std::vector has the flexibility to change its size dynamically as you need. As always, flexibility has a price. Usually that price is small and can easily be ignored, but in this case, where you have 100 million elements, the difference between std::vector<float> and std::array<float,3> is very significant. For example, if we run this code:

std::vector<float> v;
for( auto f : { 1.0, 2.0, 3.0 } )
    v.push_back(f);
std::cout << sizeof(v) << "-" << v.capacity() << std::endl;
std::cout << sizeof(std::array<float,3>) << std::endl;


we can see that on this platform std::vector<float> itself takes 24 bytes, plus it dynamically allocates memory for 4 floats (16 bytes), versus just 3 floats (12 bytes) if you used a fixed-size structure. So in your case the difference would be:

1 std::vector - ( 24 + 16 ) * 100 000 000 = 4 000 000 000 bytes
2 std::array  -          12 * 100 000 000 = 1 200 000 000 bytes

That is a difference of 2 800 000 000 bytes, or almost 3 GB of memory.

But that is not the end of it: std::vector has another price, because it must keep all its data in one contiguous block. Usually this is done by reallocating to a larger capacity when the size reaches the current one. In this case it means that the memory needed while building the data can easily be more than doubled: say the capacity reaches 50 million and the vector needs to reallocate; it creates another block for, say, 100 million elements while keeping the previous one (so your memory has to hold 150 million elements) and copies the data over. And that is without considering memory fragmentation.

So the recommended solution is to use std::array<float,3> (or a struct with 3 floats) for the inner data, and either std::deque as the outer container or, if you have to use std::vector, to reserve memory for enough elements in advance.
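
A minimal sketch of that recommendation applied to the mesh from the question (the numbers and names are just illustrative):

#include <array>
#include <cstddef>
#include <vector>

int main() {
    const std::size_t N = 10000;

    // Fixed-size inner type: each point is 3 floats stored inline, no per-point heap allocation.
    std::vector<std::array<float, 3>> mesh_points_A;

    // Reserve everything up front so the outer vector never has to reallocate and copy.
    // (Alternatively, use std::deque<std::array<float,3>>, which grows in chunks
    // and never needs one huge contiguous block.)
    mesh_points_A.reserve(N * N);

    for (std::size_t i = 0; i < N; i++) {
        for (std::size_t j = 0; j < N; j++) {
            std::array<float, 3> xyz{{ float(i), float(j), 0.3f }};
            mesh_points_A.push_back(xyz);
        }
    }
    return 0;
}

With 12 bytes per point this is still on the order of 1.2 GB for N = 10000, so very large N will eventually exhaust memory anyway, as noted in the comments above; the change only removes the per-element overhead and the reallocation copies.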

