3

I am aware of several related questions, such as "Parsing a comma-delimited std::string". However, I have written code that fits my specific need: splitting a string (read from a file) at commas and stripping any whitespace. Later I want to convert these substrings to double and store them in a std::vector. Not all operations are shown. Here is the code:

    #include "stdafx.h"
    #include<iostream>
    #include<string>
    #include<vector>
    #include<algorithm>

    int main()
    {
        std::string str1 = " 0.2345, 7.9 \n", str2;
        str1.erase(remove_if(str1.begin(), str1.end(), isspace), str1.end()); //remove whitespaces
        std::string::size_type pos_begin = { 0 }, pos_end = { 0 };

        while (str1.find_first_of(",", pos_end) != std::string::npos)
        {
            pos_end = str1.find_first_of(",", pos_begin);
            str2 = str1.substr(pos_begin, pos_end- pos_begin);
            std::cout << str2 << std::endl;
            pos_begin = pos_end+1;
        }
    }

Output:

    0.2345
    7.9

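The program works like this. The while loop checks whether a comma occurs at or after pos_end. Inside the loop, pos_end stores the position of the first comma at or after pos_begin, str2 receives the substring between pos_begin and pos_end, and pos_begin moves to the character just after pos_end. The first iteration runs fine.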

In the next iteration, pos_end will be a very large value (std::string::npos), and I am not sure what pos_end - pos_begin will be. The same goes for pos_begin after pos_end + 1 is assigned to it (though it will not be used again). Is adding a check such as

if (pos_end == std::string::npos) pos_end = str1.length(); 

a way to go?

The program does work, though (compiled with g++ -Wall -Wextra prog.cpp -o prog -std=c++11). Is this approach correct?
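For reference, the arithmetic in question can be checked in isolation. std::string::npos is the largest value of size_type, so pos_end - pos_begin stays a very large unsigned value, and substr clamps a count that runs past the end of the string. The sketch below is only an illustration; the helper names tail_after and next_begin are made up and are not part of the program above.

```cpp
#include <string>

// Hypothetical helper: mimics the last loop iteration, where no comma is left
// at or after pos_begin and find_first_of therefore returns std::string::npos.
std::string tail_after(const std::string& s, std::string::size_type pos_begin)
{
    std::string::size_type pos_end = s.find_first_of(",", pos_begin);
    // If pos_end == npos, pos_end - pos_begin is still huge (unsigned arithmetic),
    // and substr clamps the count to the remaining length of the string.
    return s.substr(pos_begin, pos_end - pos_begin);
}

// Hypothetical helper: npos + 1 wraps around to 0 because size_type is unsigned.
std::string::size_type next_begin(std::string::size_type pos_end)
{
    return pos_end + 1;
}
```

So the last token is extracted correctly even without the explicit check, but only because of the clamping in substr and the unsigned wraparound. The explicit check makes that intent visible to the reader.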

  • What about std::istringstream and std::getline() with an appropriate delimiter? Commented Feb 15, 2018 at 23:56
  • Can it work for variable length string reading from file? Commented Feb 15, 2018 at 23:59
  • Looks reasonable. This might be a question better asked at codereview. Note that I linked to the how to ask page. This is on purpose. Make sure you comply with their rules before posting. Commented Feb 16, 2018 at 0:00
  • "Can it work for variable length string reading from file?" pick it up and move it into a function that takes a string and find out. Commented Feb 16, 2018 at 0:00
  • You might want to look at the answers to How can I read and parse CSV files in C++? and just skip the step of reading lines from the file into the string. Commented Feb 16, 2018 at 0:06

2 Answers

2

I used the C++20 ranges library and implemented it as below:

    #include <algorithm>
    #include <cctype>
    #include <cstdlib>
    #include <iostream>
    #include <ranges>
    #include <string>
    #include <vector>

    // Build a std::string from one contiguous subrange produced by split.
    auto join_character_in_each_subranges = [](auto &&rng) {
        return std::string(&*rng.begin(), std::ranges::distance(rng));
    };

    // Drop every whitespace character from a range of characters.
    auto trimming = std::ranges::views::filter([](auto character) {
        return !std::isspace(static_cast<unsigned char>(character));
    });

    int main()
    {
        std::string myline = " 0.2345, 7.9 ";
        std::vector<double> line_list;

        for (std::string&& words : myline
                 | std::ranges::views::split(',')
                 | std::ranges::views::transform(join_character_in_each_subranges))
        {
            auto words_trimming = words | trimming;
            std::string clean_number;
            std::ranges::for_each(words_trimming, [&](auto character){ clean_number += character;});
            line_list.push_back(atof(clean_number.c_str()));
        }
    }

First, split myline into subranges at the delimiter:

    myline | std::ranges::views::split(',')

Then turn each subrange into a std::string with the transform view. std::ranges::views::transform lazily applies the given function to each element of the range as it is iterated:

    std::ranges::views::transform(join_character_in_each_subranges)

Next, filter the whitespace out of each token. Note that the filter drops every whitespace character, not only leading and trailing ones:

    auto words_trimming = words | trimming;

Collect the filtered view back into a std::string:

    std::ranges::for_each(words_trimming, [&](auto character){ clean_number += character;});

Finally, convert each clean_number to double and push_back it into the list:

    line_list.push_back(atof(clean_number.c_str()));

1

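Your erase idiom may fail to compile on more modern compilers because isspace is overloaded (a function in <cctype> and a function template in <locale>), so passing it by name to remove_if can be ambiguous. At some point, removing whitespace with a range-based for loop might be more efficient. The right algorithm depends on whether you need to store the tokens, whether you need to detect and correct "syntax" errors in the line, and whether empty tokens should be stored.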

    #include<iostream>
    #include<string>
    #include<list>
    #include<algorithm>
    #include<cctype>

    typedef std::list<std::string> StrList;

    void tokenize(const std::string& in, const std::string& delims, StrList& tokens)
    {
        tokens.clear();

        std::string::size_type pos_begin, pos_end = 0;
        std::string input = in;

        // Remove all whitespace; the lambda avoids the overloaded-isspace ambiguity.
        // Taking unsigned char keeps std::isspace well-defined and works in C++11.
        input.erase(std::remove_if(input.begin(), input.end(),
                                   [](unsigned char x){ return std::isspace(x) != 0; }),
                    input.end());

        // Skip any run of delimiters, then take everything up to the next delimiter.
        while ((pos_begin = input.find_first_not_of(delims, pos_end)) != std::string::npos)
        {
            pos_end = input.find_first_of(delims, pos_begin);
            if (pos_end == std::string::npos) pos_end = input.length();

            tokens.push_back(input.substr(pos_begin, pos_end - pos_begin));
        }
    }

    int main()
    {
        std::string str = ",\t, 0.2345,, , , 7.9 \n";
        StrList vtrToken;

        tokenize(str, ",", vtrToken);

        int i = 1;
        for (auto &s : vtrToken)
            std::cout << i++ << ".) " << s << std::endl;

        return 0;
    }

Output:

    1.) 0.2345
    2.) 7.9

This variant discards all empty tokens. Whether that is right depends on your context, so there is no single correct answer. If you have to check that the string was well-formed, or replace empty tokens with default values, you have to add additional checks.
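As one possible shape for such additional checks, the sketch below keeps every field, including empty ones, and substitutes a default value for them. It is only an illustration under stated assumptions: the name tokenize_keep_empty and the fallback parameter are hypothetical, the delimiter is fixed to a comma, and whitespace is assumed to have been removed from the input already.

```cpp
#include <string>
#include <vector>

// Hypothetical variant: splits at ',' and keeps empty fields, replacing each one
// with `fallback`. Assumes whitespace has already been removed from `input`.
std::vector<std::string> tokenize_keep_empty(const std::string& input,
                                             const std::string& fallback)
{
    std::vector<std::string> tokens;
    std::string::size_type pos_begin = 0;
    while (true) {
        std::string::size_type pos_end = input.find(',', pos_begin);
        if (pos_end == std::string::npos)
            pos_end = input.length();   // last field runs to the end of the string
        std::string token = input.substr(pos_begin, pos_end - pos_begin);
        tokens.push_back(token.empty() ? fallback : token);
        if (pos_end == input.length())
            break;                      // consumed the last field
        pos_begin = pos_end + 1;        // skip the comma
    }
    return tokens;
}
```

Counting the returned tokens also gives a simple well-formedness check, since a line with n commas always yields n + 1 fields here.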
