12

I am having problems with std::string.

The problem is that '\0' is treated as the end of the string, as in C-style strings.

For example, the following code:

#include <iostream>
#include <string>

int main ()
{
    std::string s ("String!\0 This is a string too!");
    std::cout << s.length(); // same result as with s.size()
    std::cout << std::endl << s;
    return 0;
}

outputs this:

7
String!

What is the problem here? Shouldn't std::string treat '\0' just as any other character?

8
  • 3
    Why is it a problem? Commented Mar 18, 2017 at 12:59
  • 1
    Because a C++ string shouldn't be null-terminated (I think), and '\0' should be treated like any other character. Commented Mar 18, 2017 at 13:01
  • Yes, but that does not answer my question. :) Why do you have a \0 in the string in the first place? It's not a printable character anyway. Commented Mar 18, 2017 at 13:02
  • 1
    Because I didn't make the string. For example, if there is a function that takes a string as a parameter (say std::string mystring(std::string s) {...}) and someone passes it an array of chars like this: mystring("String!\0 This is a string too!"), the length of that string will stop at \0. Commented Mar 18, 2017 at 13:13
  • 1
    Possible duplicate of How do you construct a std::string with an embedded null? Commented Mar 18, 2017 at 14:36

7 Answers 7

17

Think about it: if you are given a const char*, how will you determine which 0 is the true terminator and which is an embedded one?

You need to either explicitly pass the size of the string, or construct the string from two iterators (pointers).

#include <string>
#include <iostream>

int main()
{
    auto& str = "String!\0 This is a string too!";
    std::string s(std::begin(str), std::end(str));
    std::cout << s.size() << '\n' << s << '\n';
}

Example: http://coliru.stacked-crooked.com/a/d42211b7199d458d
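The other option mentioned above, passing the size explicitly, could look roughly like the sketch below. It relies on the (pointer, count) constructor of std::string; the names from_buffer, kLiteral and full_string are made up for illustration. One thing to keep in mind: std::end on the array also spans the implicit terminator, so the iterator version produces 31 characters, while the sketch drops the terminator and produces 30.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the standard library): builds a
// std::string from a pointer plus an explicit length.
std::string from_buffer(const char* data, std::size_t length) {
    // The (pointer, count) constructor copies exactly `length` chars
    // and does not stop at '\0'.
    return std::string(data, length);
}

// The literal from the question, stored as an array so that sizeof
// still knows its full size (30 characters plus the terminator).
const char kLiteral[] = "String!\0 This is a string too!";

std::string full_string() {
    // sizeof counts the implicit terminating '\0'; subtract 1 to drop it.
    return from_buffer(kLiteral, sizeof(kLiteral) - 1);
}
```

Calling full_string() should give a string whose size() is 30 and whose character at index 7 is '\0'.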

Edit: @Rakete1111 reminded me about string literals:

using namespace std::literals::string_literals;
auto str = "String!\0 This is a string too!"s;

6 Comments

Why not use string literals? auto str = "String!\0 This is a string too!"s;
This answer is closest to what I was looking for! thanks :)
The interesting corollary question is why isn't the std::string constructor overloaded to work with string literals to handle embedded nul chars? From your answer it is clear that the compiler itself is obviously not in the least confused about where the string ends, otherwise std::end would give an incorrect answer too. It is only when going to the standard library that the information gets downgraded to a simple const char * and therefore lost.
@user4815162342 I think it is to prevent problems like "why is my string full of garbage characters?". When I hear char array, I think buffer. In most cases you do not fill the whole buffer: usually it is created large enough to hold any string written to it and then passed to some function that fills it. So I believe there was more demand for treating char arrays like C strings.
@user4815162342: That's an excellent question. All it would take is a simple template <int Size> basic_string(CharT const (&array)[Size]);. I'm sure there's a good reason, though. Strangely enough, I cannot find any discussion about this anywhere on SO.
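For what it's worth, the array-reference idea from the comment above can be tried out as a free function instead of a constructor. This is only a sketch; from_literal is a hypothetical name and nothing like it exists in the standard library.

```cpp
#include <cstddef>
#include <string>

// Hypothetical free function, not a real std::basic_string constructor.
// The array reference parameter preserves the size N, including the
// implicit terminating '\0' that every string literal carries.
template <std::size_t N>
std::string from_literal(const char (&array)[N]) {
    // Copy N - 1 chars: embedded '\0' stay, the final terminator is dropped.
    return std::string(array, N - 1);
}
```

With this, from_literal("String!\0 This is a string too!") should have size 30. The buffer concern raised in the comments still applies: passing a partially filled char buffer[100] would copy all 99 leading chars, garbage included.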
4

Your std::string really has only 7 characters and a terminating '\0', because that's how you construct it. Look at the list of std::basic_string constructors: There is no array version which would be able to remember the size of the string literal. The one at work here is this one:

basic_string( const CharT* s, const Allocator& alloc = Allocator() ); 

The "String!\0 This is a string too!" char const[] array is converted to a pointer to its first char element. That pointer is passed to the constructor and is all the information it has. To determine the size of the string, the constructor has to increment the pointer until it finds the first '\0', which happens to be the one embedded inside the array.
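A small sketch of that loss of information, assuming the literal is first stored in an array (kText, array_size and scanned_length are illustrative names only): sizeof still sees the whole array, while anything that only receives the pointer has to scan for the first '\0', as std::strlen does.

```cpp
#include <cstddef>
#include <cstring>

// The array type still carries the full size of the literal.
const char kText[] = "String!\0 This is a string too!";

// Number of chars in the array, including the implicit terminator.
std::size_t array_size() { return sizeof(kText); }

// What code that only sees a const char* can find out:
// it scans forward until the first '\0'.
std::size_t scanned_length() {
    const char* p = kText;  // array-to-pointer conversion; the size is lost here
    return std::strlen(p);
}
```

Here array_size() should be 31 and scanned_length() should be 7, the same 7 that s.length() reports in the question.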


If you happen to work with a lot of zero bytes in your strings, then chances are that std::vector<char> or even std::vector<unsigned char> would be a more natural solution to your problem.
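As a rough illustration of the vector approach, assuming the data arrives as a raw buffer with a known length (to_bytes and count_zero_bytes are hypothetical helpers, not library functions):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: copies a raw buffer of known length into a vector.
// Zero bytes have no special meaning to std::vector.
std::vector<unsigned char> to_bytes(const unsigned char* data, std::size_t length) {
    return std::vector<unsigned char>(data, data + length);
}

// Counts how many zero bytes the vector holds.
std::size_t count_zero_bytes(const std::vector<unsigned char>& bytes) {
    return static_cast<std::size_t>(
        std::count(bytes.begin(), bytes.end(), static_cast<unsigned char>(0)));
}
```

The size always comes from the caller here, so there is no terminator to misinterpret.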

Comments

3

You are constructing your std::string from a string literal. String literals are automatically terminated with a '\0'. A string literal "f\0o" is thus encoded as the following array of characters:

{'f', '\0', 'o', '\0'} 

The string constructor taking a char const* will be called, and will be implemented something like this:

string(char const* s)
{
    auto e = s;
    while (*e != '\0') ++e;
    m_length = e - s;
    m_data = new char[m_length + 1];
    memcpy(m_data, s, m_length + 1);
}

Obviously this isn't a technically correct implementation, but you get the idea. The '\0' you manually inserted will be interpreted as the end of the string.

If you want the embedded '\0' to be treated as an ordinary character, you can use a std::string literal (available since C++14):

#include <iostream>
#include <string>

int main ()
{
    using namespace std::string_literals;

    std::string s("String!\0 This is a string too!"s);
    std::cout << s.length(); // same result as with s.size()
    std::cout << std::endl << s;
    return 0;
}

Output:

30
String! This is a string too!
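The NUL character is really in there even though the terminal usually shows nothing for it. One way to check, sketched with the made-up helpers sample and first_nul:

```cpp
#include <cstddef>
#include <string>

using namespace std::string_literals;

// The string from the example above, built with the s suffix (C++14).
std::string sample() { return "String!\0 This is a string too!"s; }

// Position of the first embedded '\0', or std::string::npos if there is none.
std::size_t first_nul(const std::string& s) {
    return s.find('\0');
}
```

first_nul(sample()) should be 7, and sample().substr(8) should be the text after the NUL.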

Comments

1

\0 is known as a terminating character, so you'll need to escape it somehow.

String representation

Take that as an example.

So whenever you want to escape special characters, use two backslashes: "\\0"

And '\\0' is a two-character literal

 std::string test = "Test\\0 Test";

Results:

 Test\0 Test 

Many beginners also make this mistake when loading files, for example:

 std::ifstream some_file("\new_dir\test.txt"); //Wrong
 //You should be using it like this :
 std::ifstream some_file("\\new_dir\\test.txt"); //Correct

3 Comments

Your results don't match. This would produce Test\0 Test.
About your file example: the correct way would be to just write /new_dir/test.txt. The majority of Windows subsystems work just fine with forward slashes in paths.
@Revolver_Ocelot Yeah, but not everybody does it that way so mistakes can happen.
0

In very few words, you're constructing your C++ string from a standard C string.

And standard C strings are zero-terminated. So your C string parameter is terminated at the first \0 character found, and that character is the one you explicitly provided in your string "String!\0 This is a string too!"

It is not the second one, which the compiler implicitly and automatically appends to the end of your C string.

Comments

-1

That's not a problem, that's the intended behavior.

Maybe you could elaborate why you have a \0 in your string.

Using a std::vector would allow you to use \0 in your string.

1 Comment

std::string is just fine with '\0' byte in the string, no need to use std::vector just because of it.
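A quick sketch backing up that comment (build_with_nul is just an illustrative name): once the string object exists, ordinary operations treat '\0' like any other char. Only conversions back to a C string, such as reading c_str() with C-style functions, stop at it.

```cpp
#include <string>

// Builds "ab\0cd" step by step; none of these operations treat '\0' specially.
std::string build_with_nul() {
    std::string s = "ab";
    s.push_back('\0');  // appends a single NUL character
    s += "cd";          // appends after the NUL, not over it
    return s;
}
```

The result should have size 5, while constructing a new std::string from its c_str() would give only "ab".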
-2

Escape your \0

std::string s ("String!\\0 This is a string too!"); 

and you will get what you need:

31
String!\0 This is a string too!

2 Comments

The OP wants their string to contain a NUL character, not a backslash followed by a 0.
It's a good example of how not to do it.
