Data race when catching by non-const reference
People say “always catch exceptions by const reference,” which I think is good advice. I’ve worked on at least one codebase that catches by non-const reference, presumably just because it’s shorter to type. This has never caused any problem for them in practice, as far as I know. But in theory, it’s a bad idea, because in C++ it is possible to construct a program where two different catch-blocks, executing concurrently, actually reference the exact same exception object. In that situation, any mutation to the exception object will cause a data race. The easiest way to prevent the race is to prevent mutation: to catch by const reference!
Here’s my toy test program. We start with our exception object, which has a non-const-qualified method that mutates the object and checks to see if anyone else is mutating it at the same time (because that would mean a data race in a real program):
struct Atomic { std::atomic<int> value_; void detect_racey_accesses() { for (int i=0; i < 10000; ++i) { int last_value = value_.load(); ++value_; int next_value = value_.load(); if (next_value != last_value + 1) { printf("Detected a race: %d != %d+1\n", next_value, last_value); } } } }; (Now, technically, concurrent calls to detect_racey_accesses do not cause data race in the C++ sense, because I’m using an atomic<int>; the data race here is at the logical level but not the physical level. In real code, where value_ would be a plain old int, we would have a physical data race here.)
And here’s the program that throws an Atomic, catches the same object by reference from two different threads, and then detects racey mutation of that object:
int main() { std::promise<std::exception_ptr> prom; std::thread T1( [fut = prom.get_future()]() mutable { std::exception_ptr ex = fut.get(); try { std::rethrow_exception(ex); } catch (Atomic& a) { a.detect_racey_accesses(); // A } } ); std::thread T2( [prom = std::move(prom)]() mutable { try { try { throw Atomic(); } catch (...) { prom.set_value(std::current_exception()); throw; } } catch (Atomic& b) { b.detect_racey_accesses(); // B } } ); T1.join(); T2.join(); } Lines A and B run concurrently, and the object referenced by Atomic& a on line A is the exact same object as the object referenced by Atomic& b on line B.
So, catch your exceptions by const reference — and follow the “const means thread-safe” rule in your own code — and you will avoid this extremely contrived but extremely surprising situation.
This also explains why, in contrived code such as
catch (Foo& f) { return f; } we do not, and cannot, get either copy elision or implicit move on return f — it must make a copy, because the original exception object might still be in use by another thread.
