1

I found something surprising with std::vector that I thought I'd ask about here to hopefully get some interesting answers.

The code below simply copies a string into a char vector and prints the contents of the vector in two ways.

#include <vector>
#include <string>
#include <iostream>

int main()
{
    std::string s("some string");
    std::vector<char> v;
    v.reserve(s.size()+1);

    // copy using index operator
    for (std::size_t i=0; i<=s.size(); ++i)
        v[i] = s[i];
    std::cout << "&v[0]: " << &v[0] << "\n";
    std::cout << "begin/end: " << std::string(v.begin(), v.end()) << "\n";

    // copy using push_back
    for (std::size_t i=0; i<=s.size(); ++i)
        v.push_back(s[i]);
    std::cout << "&v[0]: " << &v[0] << "\n";
    std::cout << "begin/end: " << std::string(v.begin(), v.end()) << "\n";

    return 0;
}

Building and running this yields:

$ g++ main.cpp -o v && ./v
&v[0]: some string
begin/end:
&v[0]: some string
begin/end: some string

My expectation was that it would print the string correctly in both cases, but assigning character by character using the index operator doesn't print anything when later using begin() and end() iterators.

Why isn't end() updated when using []? If this is intentional, what's the reason it's working like this?

Is there a reasonable explanation for this behaviour? :)

I've only tried this with gcc 4.6.1 so far.

5
  • I'm not sure if it's relevant to your problem, but is the last iteration of your copy loop going to be ... s[s.size()]? I get that this is to copy the nul byte, but you aren't guaranteed to be able to get it this way, it only has to work for the actual characters in the string. s.c_str()[s.size()] would work, as that has to be a nul-terminated c string. Commented Feb 1, 2012 at 12:26
  • ¤ In addition to the trivial answer of using resize, do use c_str() to guarantee a null byte at the end of the string. Or use some other means to guarantee it. Unless you're coding for C++11 and are absolutely certain that the compiler complies with C++11 on this (and also, that I'm not remembering the C++11 rules incorrectly). Cheers & hth., Commented Feb 1, 2012 at 12:29
  • btw. this is a useful gotw on this: gotw.ca/gotw/074.htm Commented Feb 1, 2012 at 12:49
  • We had such a question only yesterday: Be careful of vector::reserve? Commented Feb 1, 2012 at 13:06
  • @AlfP.Steinbach: You are remembering correctly: C++11 requires that s[s.size()] gives a reference to a null character. C++03 was a bit weirder, only requiring that for the const overload (which isn't being used in this case), and from c_str() as you say. Commented Feb 1, 2012 at 13:22

2 Answers

11

Typical example of Undefined Behavior.

You are only ever allowed to access elements by index (using operator[]) between 0 and v.size()-1 (inclusive); for an empty vector there is no valid index at all.

Using reserve changes only the capacity, not the size. Had you used resize instead, it would have worked as expected.


3 Comments

Heh, of course you're right. That fixed it. Thanks for the quick answer.
Well the access of the vector buffer is not very undefined when you get down to it, but that's a longer discussion. The expectation of a nullbyte at the end of the string is, however, misplaced for C++03. Note that C++03 does guarantee a nullbyte result at the end for the const operator[]. As I recall C++11 also guarantees it for non-const. It's a rather surprising subtlety. Cheers,
@AlfP.Steinbach: You have a keen eye. It had not occurred to me that there was actually another instance of undefined behavior there!
5

In the first case, you have undefined behaviour. reserve sets the capacity, but leaves the size as zero. Your loop then writes to invalid locations beyond the end of the vector. Printing using the (invalid) pointer appears to work (although there is no guarantee of that), since you've written the string to the memory that it points at; printing using the iterator range prints nothing, because the vector is still empty.

The second loop correctly increases the size each time, so that the vector actually contains the expected contents.

Why isn't end() updated when using []? If this is intentional, what's the reason it's working like this?

[] is intended to be as fast as possible, so it does no range checking and never changes the size. If you want a range check, use at(), which throws std::out_of_range on an out-of-range access. If you want the vector to grow, you have to do it yourself, with resize or push_back.

1 Comment

+1 for at, beginners (strike that) developers should be recommended to use it!
