9

Possible Duplicate:
How do I tokenize a string in C++?

Hello, I was wondering how I would tokenize a std::string with strtok.

string line = "hello, world, bye";
char * pch = strtok(line.c_str(), ",");

I get the following error

error: invalid conversion from ‘const char*’ to ‘char*’
error: initializing argument 1 of ‘char* strtok(char*, const char*)’

I'm looking for a quick and easy approach to this, as I don't think it should require much time.

1
  • I have seen this kind of question before. Possible duplicate. Commented Sep 27, 2012 at 19:11

4 Answers

18

I always use getline for such tasks.

istringstream is(line);
string part;
while (getline(is, part, ','))
    cout << part << endl;
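
For reference, the getline loop above can be wrapped into a small reusable function. This is only a sketch: the name split_with_getline and the choice to return a std::vector are assumptions for illustration, not part of the original answer.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (name chosen for illustration): splits `line` on a
// single delimiter character using std::getline on a string stream.
std::vector<std::string> split_with_getline(const std::string& line, char delim) {
    std::vector<std::string> parts;
    std::istringstream is(line);
    std::string part;
    // std::getline reads up to (and discards) the next delimiter each time.
    while (std::getline(is, part, delim)) {
        parts.push_back(part);
    }
    return parts;
}
```

Note that whitespace is not trimmed: splitting "hello, world, bye" on ',' keeps the leading space in " world" and " bye". An empty input yields no parts at all.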

11
std::string::size_type pos = line.find_first_of(',');
std::string token = line.substr(0, pos);

To find the next token, repeat find_first_of but start the search at pos + 1.

1 Comment

With this, there would have to be another variable to keep track of pos1 and pos2. Otherwise, you would be using substring from 0 to what the new pos is, instead of pos1 to pos2.
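
A minimal sketch of the loop the comment describes, keeping two positions (here called start and end, standing in for pos1 and pos2). The function name split_with_find is an assumption made for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (name chosen for illustration): splits `line` on a
// single delimiter character by tracking the start and end of each token.
std::vector<std::string> split_with_find(const std::string& line, char delim) {
    std::vector<std::string> tokens;
    std::string::size_type start = 0;                               // pos1
    std::string::size_type end = line.find_first_of(delim, start);  // pos2
    while (end != std::string::npos) {
        // Take the characters between the two positions, not from 0.
        tokens.push_back(line.substr(start, end - start));
        start = end + 1;  // resume just past the delimiter
        end = line.find_first_of(delim, start);
    }
    // Whatever follows the last delimiter is the final token.
    tokens.push_back(line.substr(start));
    return tokens;
}
```

Unlike the getline approach, this version always returns at least one token, so an empty input produces a single empty string.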
4

You can use strtok by doing &*line.begin() to get a non-const pointer to the char buffer. In C++, though, I usually prefer to use boost::algorithm::split.

6 Comments

I think by discarding the const on the internal pointer of the string, you allow strtok to modify the string's internal pointer - very dirty.
This is a terrible idea. It will put the std::string into an undefined state. You are not supposed to modify a std::string using C string functions.
@japreiss How can it possibly go wrong? There's nothing wrong with modifying the characters in a string through its iterators, and C++ strings are always contiguous in practice, and are guaranteed to be contiguous and null-terminated in C++11.
@spencercw: There is no guarantee that the string's internal representation is zero-terminated; and it might use copy-on-write semantics, in which case subverting const could change other copies of the string. It might (or might not) be possible to demonstrate that what you're doing is well-defined for any conformant implementation, but even if you can, I wouldn't like to test the edge cases of conformance like that.
@MikeSeymour The internal buffer is guaranteed to be null-terminated in C++11 (see this answer). You raise an interesting point with copy-on-write though. I would guess that in such an implementation dereferencing the iterator would trigger the copy, or some sort of memory guard would trigger the copy anyway when strtok writes to the buffer. Are there any implementations that actually do CoW?
1

strtok is a rather quirky, evil function that modifies its argument. This means that you can't use it directly on the contents of a std::string, since there's no way to get a pointer to a mutable, zero-terminated character array from that class.

You could work on a copy of the string's data:

std::vector<char> buffer(line.c_str(), line.c_str() + line.size() + 1);
char * pch = strtok(&buffer[0], ",");

or, for more of a C++ idiom, you could use a string-stream:

std::stringstream ss(line);
std::string token;
std::getline(ss, token, ',');

or find the comma more directly:

std::string token(line, 0, line.find(',')); 
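
If strtok itself is wanted, the copy-based approach above extends to a full loop along these lines. This is a sketch under assumptions: the name split_with_strtok is hypothetical, and nullptr requires C++11 (use NULL on older compilers).

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper (name chosen for illustration): tokenizes a copy of
// `line` with std::strtok, leaving the original std::string untouched.
std::vector<std::string> split_with_strtok(const std::string& line, const char* delims) {
    // Copy the characters plus the terminating '\0' into a mutable buffer,
    // because strtok overwrites delimiters with '\0' as it goes.
    std::vector<char> buffer(line.c_str(), line.c_str() + line.size() + 1);
    std::vector<std::string> tokens;
    for (char* tok = std::strtok(&buffer[0], delims);
         tok != nullptr;
         tok = std::strtok(nullptr, delims)) {
        tokens.push_back(tok);  // copies the token out of the buffer
    }
    return tokens;
}
```

Because strtok treats every character in delims as a delimiter and skips empty tokens, passing ", " splits "hello, world, bye" into hello, world and bye without the leading spaces. strtok keeps hidden static state, so this sketch is not safe to call from several threads at once.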

1 Comment

There is no std::readline(). Did you mean std::getline()?
