5

The source code

    #include <iostream>
    #include <string>
    using namespace std;

    int main() {
        std::string s{'a', 'b', '\0', 'c'};
        std::string s1 = "ab\0c";
        cout << s.size() << " " << s << endl;
        cout << s1.size() << " " << s1 << endl;
        return 0;
    }

and the output is

    4 abc
    2 ab

Why does this happen, and is there any difference between these two kinds of initialization in C++? Thanks.

7
  • 3
    I can't copy paste pictures Commented Mar 3, 2017 at 15:21
  • 3
    You should put your source code directly in your question in the form of text. Many users won't even look at code in the form of a screen shot. Commented Mar 3, 2017 at 15:22
  • 1
    This is my first time asking a question, and I'm sorry. I have fixed it, and thanks again. Commented Mar 3, 2017 at 15:28
  • 1
    It's unfortunate how quick users are to down vote first posts rather than correcting the new users on their mistakes. Commented Mar 3, 2017 at 15:33
  • 2
    A general tip: don't use endl unless you specifically need the buffers flushed. Just use "\n" instead. Output to a terminal is usually line-buffered anyway, and when writing to a pipe or file you avoid the unnecessary overhead of flushing the buffer on every line. Commented Mar 3, 2017 at 15:38

2 Answers

8

For s you're matching the constructor that accepts an initialiser-list of characters: that's (9) in the list here. The string class lets you construct strings from arbitrary data which may include embedded NULs, as it does in this case. The initialiser list knows its own length, so the string captures all the characters.

For s1, the matching constructor is (5) in the above-linked list, which accepts a const char* - the compiler lets the array of char provided decay to such a pointer before calling that constructor, which means the constructor has no knowledge of the length of the array. Instead, it assumes you're deliberately using the ASCIIZ NUL-terminated string convention (as in "C" strings), and scans through the data to find the first NUL, considering that the terminator. Consequently, only 2 characters are captured in the string.
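To make the "scans through the data to find the first NUL" step concrete, here is a minimal sketch. The helper names `from_pointer` and `from_pointer_manual` are made up for illustration; the second one only mimics what the `const char*` constructor effectively does and is not how any particular standard library implements it.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: uses the const char* constructor, which has no
// length information and stops at the first NUL it finds.
std::string from_pointer(const char* p) {
    return std::string(p);
}

// Hypothetical helper: the same result spelled out by hand. strlen walks
// the data until the first NUL, and that count is passed on explicitly.
std::string from_pointer_manual(const char* p) {
    return std::string(p, std::strlen(p));
}
```

Calling either helper with "ab\0c" should give a string of size 2, because the literal has already decayed to a pointer by the time the helper sees it.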

Note that you can explicitly capture 4 characters with...

std::string s1 { "ab\0c", 4}; 

...which matches constructor (4) in the list.

Rakete1111's comment below illustrates another, newer way to create such strings: auto s1 = "ab\0c"s;.
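A short sketch of the two ways of keeping all 4 characters mentioned above. The function names are hypothetical wrappers added only so the two forms can be compared; the `s` suffix needs C++14 or later.

```cpp
#include <string>

// Hypothetical wrapper: pointer-plus-count constructor, so the embedded
// NUL is copied like any other character.
std::string with_explicit_count() {
    return std::string("ab\0c", 4);
}

// Hypothetical wrapper: the std::string literal suffix (C++14), which
// receives the literal's length from the compiler.
std::string with_string_literal() {
    using namespace std::string_literals;
    return "ab\0c"s;
}
```

Both should produce a string of size 4 whose third character is '\0'.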


8 Comments

Thanks, you helped me understand more. Usually, I don't notice these differences. I should pay more attention to the constructors in the STL. By the way, does that mean the second way is similar to char s[ ]="ab\0c" in C?
You're welcome. Regarding char s[ ]="ab\0c" in C - yes and no - s there would capture all 4 characters (plus the terminating NUL), but if you passed it to a function expecting a const char*, it would decay to a pointer, so e.g. strdup(s) would return a new buffer lacking the final 'c', similar to the string constructor behaviour. Hope that helps clarify things!
Oh, I see. Thanks for clearing up the confusion, I learned a lot!
You could also use string literals: auto s1 = "ab\0c"s;
@Rakete1111: good point - I wish all the compilers I've had to work with recently supported that already!
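The char array case from the comments above can be checked with a small sketch. The function names are invented for illustration, and `std::strlen` stands in for any function that takes a `const char*` (strdup itself is POSIX rather than standard C++, so it is not used here).

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: the array itself keeps every byte of the literal,
// including the embedded NUL and the terminating NUL.
std::size_t array_bytes() {
    char s[] = "ab\0c";
    return sizeof(s);
}

// Hypothetical helper: once s decays to a pointer, a C string function
// only sees the characters before the first NUL.
std::size_t visible_length() {
    char s[] = "ab\0c";
    return std::strlen(s);
}
```

The array occupies 5 bytes ('a', 'b', NUL, 'c', NUL), while the C string view of it is only 2 characters long.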
0

The reason is that std::strings are not first-class objects; they are standard library objects and have to obey the rules of C++ syntax, unlike string literals, which are first-class constructs.

An std::string is allowed embedded nuls, so when it is initialised to an array, it sees the whole of the array. When it is initialised to a string literal, it sees a char *, and its only way of detecting string end is to search for the null.

3 Comments

The reason has nothing to do with whether std::strings are "first class objects" or obey rules of "C syntax" (which they certainly do not).
Interestingly, C++ actually provides end-user and library code enough power to capture the length of a string literal constructor argument - e.g. template <size_t N> string(const char (&arg)[N]) { ...can use N in body... };. It's the std::string API that chooses not to use that approach.
FYI: {'a', 'b', '\0', 'c'} is not an array. It is a std::initializer_list<char>. Technically, a string literal is an array: it has the type const char[N].
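The length-capturing idea from the comments above can be sketched as a free function. `from_literal` is a hypothetical name, not part of std::string or the standard library, and the sketch assumes the argument is a string literal, so the last array element is the terminating NUL.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: taking the array by reference prevents decay, so
// N is deduced from the literal's type const char[N]. N counts the
// terminating NUL, hence N - 1 characters are copied.
template <std::size_t N>
std::string from_literal(const char (&arr)[N]) {
    return std::string(arr, N - 1);
}
```

With this, from_literal("ab\0c") should yield a string of size 4, where plain std::string("ab\0c") yields size 2.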
