I am new to CUDA and met a problem when writing a singleton/global variable using CUDA. The singleton allocates some cuda memory and tries to free it in the destructor. However, the destructor crashes with cudaError of 29 "driver shutting down".
By some search, I notice the reason might be that the singleton destructor is called after the program exits, when CUDA is already shutdown.
https://github.com/NVlabs/SASSI/issues/4 This link reports a similar issue when the cuda function is called in the destructor of a static member.
https://devtalk.nvidia.com/default/topic/457922/cudafree-crash-in-destructor-when-exit-is-called/ This link reports same question and an unclear solution.
Honestly I do not have much CUDA knowledge so I would like to ask for some detailed explanation and formal solution for this problem.
EDIT:
Thanks to @Robert Crovella's remind I did some tests to reproduce the problem. OK, I found that this problem happens in both singleton and global variables of a std::unordered_map or std::map which call cuda in its value object's destructor.
Working code, no std::map is used:
#include <iostream> #include <map> #define CUDA_CHECK(x) std::cerr << (x) << std::endl; class cuda_user { char* data; public: cuda_user() { std::cerr << "constr" << std::endl; CUDA_CHECK(cudaMalloc((void**)&data, 1024)); } void foo() { std::cerr << "foo" << std::endl; }; ~cuda_user() { std::cerr << "destr" << std::endl; CUDA_CHECK(cudaFree(data)); } }; cuda_user cu; int main() { cu.foo(); } outputs:
constr 0 foo destr 0 Crashed code, with same cuda_user clas, but std::map used:
#include <iostream> #include <map> #define CUDA_CHECK(x) std::cerr << (x) << std::endl; class cuda_user { char* data; public: cuda_user() { std::cerr << "constr" << std::endl; CUDA_CHECK(cudaMalloc((void**)&data, 1024)); } void foo() { std::cerr << "foo" << std::endl; }; ~cuda_user() { std::cerr << "destr" << std::endl; CUDA_CHECK(cudaFree(data)); } }; std::map<int, cuda_user> map; int main() { map[1].foo(); } Outputs:
constr 0 foo destr 29 << Error! Update:
I am using gcc48 and nvcc75 on a CentOS 6.3
int main() { { SingletonController singcont; /* .... */ } return 0; }If you have threads, you should also make sure they end before you leave the inner scope block in main. Then, the descructor of theSingletonControllerwill have no races with other threads and you have a clean shutdown. Of course, most who do singletons, do not do it that way...mainscope will lead to problematic behavior. See here. Although that description mostly focuses on kernel calls in such a class/object, the hazard applies to any CUDA call, as you have discovered. I would suggest your question is arguably a duplicate of that one, at least insofar as you are requesting a description of what is happening, and now that your example is devoid of any singleton character.