OK, so: we all know that generally the use of const_cast<>() anywhere is so bad it’s practically a programming war crime. So this is a hypothetical question about how bad it might be, exactly, in a specific case.

To wit: I ran across some code that did something like this:

std::string temporary = "/tmp/directory-XXXXX";
const char* dtemp = ::mkdtemp(const_cast<char*>(temporary.c_str()));
/// `temporary` is unused hereafter

… now, I have run across numerous descriptions about how to get writeable access to the underlying buffer of a std::string instance (q.v. https://stackoverflow.com/a/15863513/298171 for example) – all of them have the caveat that yes, these methods aren’t guaranteed to work by any C++ standard, but in practice they all do.

With this in mind, I am just curious about how using const_cast<char*>(string.c_str()) compares to other known methods (e.g. the aforementioned &string[0], etc.)… I ask because the code in which I found this method in use seems to work fine in practice, and I thought I’d see what the experts thought before I attempt the inevitable const_cast<>()-free rewrite.

  • Use std::vector<char> instead of std::string. You have the writable buffer without all of the casting. Commented Dec 7, 2015 at 5:29
  • Not what I am asking – I can think of a bunch of ways to do it that don’t have the problem. I am curious about the nature of what makes the problem problematic. Commented Dec 7, 2015 at 5:30
  • Btw: A correct implementation would be this: char temporary[] = "/tmp/directory-XXXXX"; char* dtemp = mkdtemp(temporary); Since this creates an array on the stack, which is initialized with a copy of the string literal, you have every right to modify that copy, including passing it to mkdtemp(). True, no shiny C++ stuff used here, but it's shorter, simpler, and more correct than the version you ran across :-) Commented Dec 7, 2015 at 21:25
  • @cmaster absolutely yes – while that would be saner and less circuitous, the snippet I posted is a serious oversimplification of the actual production code, which naturally is messy enough to make such a straightforward edit unstraightforwardly doable. But you are certainly correct about doing a static assignment to a writeable stack-based temporary as ideal fodder for mkdtemp() usage, indeed. Commented Dec 8, 2015 at 17:18
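
For reference, the std::vector<char> suggestion from the comments could look roughly like the sketch below. make_temp_dir is a hypothetical helper name, not something from the original code, and the sketch assumes a POSIX system where mkdtemp() is available. Note that mkdtemp() expects the template to end in six X characters.

```cpp
#include <cassert>    // assert, for quick checks by the caller
#include <stdlib.h>   // mkdtemp (declared here on glibc)
#include <unistd.h>   // mkdtemp on some BSD-derived systems; rmdir
#include <string>
#include <vector>

// Hypothetical helper: copies the template into a writable buffer,
// lets mkdtemp() rewrite it in place, and returns the resulting path.
// Returns an empty string if mkdtemp() fails (errno is left set by mkdtemp).
std::string make_temp_dir(const std::string& tmpl) {
    // The vector owns a mutable copy, so no const_cast is needed.
    std::vector<char> buf(tmpl.begin(), tmpl.end());
    buf.push_back('\0');  // mkdtemp() needs a NUL-terminated C string

    const char* result = ::mkdtemp(buf.data());
    return result ? std::string(result) : std::string();
}
```

A caller would write something like std::string dir = make_temp_dir("/tmp/directory-XXXXXX"); and check dir.empty() to detect failure. The extra copy is cheap next to the filesystem call, and the original string is never written through a pointer obtained from c_str().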

1 Answer

  • const cannot be enforced at the hardware level because in practice, in a real (non-hypothetical) environment, you can set the read-only attribute only on a whole 4K memory page, and huge pages, which drastically reduce the CPU's TLB lookup misses, are on the way.

  • const doesn't affect code generation the way restrict from C99 (spelled __restrict as an extension in most C++ compilers) does. In fact, const, roughly speaking, means "poison all write attempts to this data, I'd like to protect my invariants here".

Since std::string is a mutable string, its underlying buffer cannot be allocated in read-only memory. So const_cast<> shouldn't crash the program here unless you change bytes outside the buffer's bounds or try to delete, free() or realloc() something. However, altering chars in the buffer may be classified as an invariant violation. Because you don't use the std::string instance afterwards and simply throw it away, this shouldn't provoke a crash unless some particular std::string implementation decides to check the integrity of its invariants before destruction and forces a crash if any of them are broken. Because such a check couldn't be done in less than O(N) time and std::string is a performance-critical class, it is unlikely that anyone does it.

Another issue may come from the copy-on-write (COW) strategy: by modifying the buffer directly you may break some other std::string instance that shares the buffer with your string. But a few years ago the majority of C++ experts concluded that COW is too fragile and too slow, especially in multi-threaded environments, so modern C++ libraries shouldn't use it and should instead rely on move construction where possible and avoid heap traffic for short strings where applicable.
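
As a rough illustration of the sharing concern, the sketch below writes to a copy through the non-const operator[] and checks that the original is untouched. copy_is_independent is a hypothetical name used only here. The point of the sketch is that writes through the string's own mutable interface give a library the chance to unshare a buffer first, whereas a write through const_cast<char*>(s.c_str()) bypasses that step, which is where a copy-on-write implementation could go wrong.

```cpp
#include <cassert>
#include <string>

// Hypothetical check: returns true if modifying a copy through the
// non-const interface leaves the original string unchanged.
bool copy_is_independent(const std::string& original) {
    std::string copy = original;
    if (copy.empty()) {
        return true;  // nothing to modify, trivially independent
    }

    const char before = original[0];
    // Pick a replacement character that differs from the current one.
    copy[0] = (before == '#') ? '!' : '#';

    // The original must still hold its old first character,
    // and the copy must now hold the new one.
    return original[0] == before && copy[0] != before;
}
```

This should hold for any conforming std::string, with or without copy-on-write, because the write goes through the copy's own non-const accessor.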


17 Comments

Well, I think this is a bit misleading. It's not permitted to cast away the const and then write to the buffer through that pointer. You can only read. To get write access to the string contents, use &s[0].
@fish2000 it causes undefined behaviour, which may include appearing to work as intended. Or to put it another way, compilers may or may not support it (and you don't get any warning if they don't)
a) const cannot be enforced at hardware level b) const doesn't affect code generation like __restrict from C99. In fact, const, roughly speaking, means "poison all write attempts to this stuff, this hopefully protects my invariants".
About checking invariants in the destructor: std::string::size() is defined to be of O(1) complexity, so all implementations need to keep either the size or a pointer to the end of the string around. That information is enough to do a sanity check of the null-byte at the end of the string in constant time. So an implementation could force a crash if the terminating byte is overwritten.
@fish2000 I see it this way: The trailing NUL byte is put there implicitly by the implementation, and cannot be removed by legal user code. Hence, if the NUL byte is overwritten, the program has already entered UB. Aborting the process is a perfectly permissible course of action in such a case, and the check is cheap. True, that's pretty much of a language-lawyer point of view, but I've come to the conclusion that it's better to listen to the language-lawyers when it comes to UB. There are too many cases where code that used to work well broke miserably because a compiler proved UB.
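
To make the &s[0] suggestion from the comments concrete, here is a minimal sketch. fill_placeholder is a hypothetical stand-in for a C API like mkdtemp() that rewrites a NUL-terminated buffer in place without changing its length, and fill_in_place is likewise an invented name. The sketch assumes C++11 or later, where std::string storage is contiguous and the non-const operator[] gives write access to the characters. It also performs the constant-time terminator check described in the comments.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a C API such as mkdtemp(): overwrites the
// trailing run of 'X' characters with '0', never changing the length.
void fill_placeholder(char* buf) {
    std::size_t n = std::strlen(buf);
    while (n > 0 && buf[n - 1] == 'X') {
        buf[--n] = '0';
    }
}

// Passes the string's own buffer to the C-style function without const_cast.
std::string fill_in_place(std::string s) {
    if (!s.empty()) {
        // Non-const operator[] returns char&, so &s[0] is a writable char*.
        fill_placeholder(&s[0]);
    }
    // O(1) sanity check: the terminator after the last character is intact.
    assert(s.c_str()[s.size()] == '\0');
    return s;
}
```

For example, fill_in_place("/tmp/directory-XXXXXX") yields "/tmp/directory-000000". Since C++17 the non-const s.data() can be used in place of &s[0].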