0

How to check if a value is in a on-the-fly created set. I'm looking for some syntactic sugar, like we have in python

if s in set(['first','second','third','fourth']): print "It's one of first,second,third,fourth"; 

How can it be done efficiently in C++?

6
  • 6
    Can you explain what's wrong with find? Commented Nov 16, 2018 at 16:14
  • sorry, I've just corrected my code to show that I want the most optimal way. I'm looking for something that can be done in one line - in the if. Commented Nov 16, 2018 at 16:15
  • 14
    @pawel_j Almost anything can be done in one line in c++ if your line is long enough. Commented Nov 16, 2018 at 16:16
  • 4
    "I want the most optimal way. I'm looking for something that can be done in one line"... one thing has nothing to do with the other. Commented Nov 16, 2018 at 16:18
  • 1
    std::set::count method is a way to go, if you have rvalue of type set. Commented Nov 16, 2018 at 16:21

7 Answers 7

5

How about this:

std::string s = "first"; if(std::set<std::string>{"first","second","third","fourth"}.count(s)>=1){ std::cout << s << " is found" << std::endl; } 

BTW, in C++20 and over I think std::set::contains is more preferable.

Sign up to request clarification or add additional context in comments.

7 Comments

Why not unordered_set?
@scohe001 For a set of 4 elements, you would have to measure the overheads to see which is better. Even an std::vector or std::array might be a contender at that number of elements.
Alternatively, in C++20 .contains(s) can be used.
@FrançoisAndrieux meh fair enough.
@pawel_j count() will not iterate over all elements. For plain std::set (as opposed to std::multiset) it's no different from .contains() (except one looks prettier than the other).
|
5

If you want this to be efficient then you are not going to want to construct a container just to check if the value exists. What we can do is leverage C++17's fold expressions and write a function like

template<typename... Args, std::enable_if_t<std::conjunction_v<std::is_same<const char*, Args>...>, bool> = true> bool string_exists(std::string to_find, Args... args) { return ((to_find == args) || ...); } 

which then lets you write the if statement like

int main() { std::string element; // get value of element somehow if (string_exists(element, "1", "2", "3", "4")) std::cout << "found"; } 

And now no container and no std::string objects are created. If you want to accept string objects you could change the functions to

template<typename... Args, std::enable_if_t<std::conjunction_v<std::is_same<const char*, Args>...> || std::conjunction_v<std::is_same<std::string, Args>...>, bool> = true> bool string_exists(std::string to_find, Args... args) { return ((to_find == args) || ...); } int main() { std::string element; // get value of element somehow if (string_exists(element, some, other, string, variables)) std::cout << "found"; } 

Do note that the last example doesn't allow you to mix string literals with std::string's.

1 Comment

That's quite a good solution with the template folding, but not really an optimization when the number of the possible values is larger, providing that set was created once and later reused. Please see my answer below
3

I do not see anything wrong in using find method

std::set<std::string>aset {"first", "second", "third", "fourth"}; std::string s = "third"; if(aset.find(s) != aset.end()){ std::cout << "It's one of first,second,third,fourth"; } 

Comments

1

Something like this should work:

#include <iostream> #include <string> #include <unordered_set> void foo(const std::string& s) { if (std::unordered_set<std::string>{"1", "2", "3", "4"}.count(s) != 0) { std::cout << "It's one of 1,2,3,4\n"; } } 

In C++17 you can let deduction guides come into play to clean up the template argument (note that I need to use std::string literals to avoid deducing as const char*):

using namespace std::string_literals; if (std::unordered_set{"1"s, "2"s, "3"s, "4"s}.count(s) != 0) { ... } 

Comments

1

How can it be done efficiently in C++? [...] I'm looking for something that can be done in one line - in the if

In "one line - in the if" and "efficiently" are different things.

If you really want all in a line, I suggest the 0x5453's solution (also the Hiroki's one) that uses count.

I propose an alternative based on emplace() (but works also with insert())

if ( ! std::set<std::string>{"first", "second", "third", "fourth"}.emplace(s).second ) std::cout << "It's one of first,second,third,fourth" << std::endl; 

but I don't think it's very efficiently: think if you use it in a loop: every time you have to create a set and insert a new element; so you have to recreate the same set and insert another element.

It's better the solution based on count() but just because the compiler can optimize the code avoiding to recreate the set every time.

If you want efficiency, I suggest the haccks's solution (making also const the set, but I suspect that the compiler optimize anyway)

 std::set<std::string> const numSet {"first", "second", "third", "fourth"}; if ( numSet.find(s) != numSet.cend() ) std::cout << "It's one of first,second,third,fourth" << std::endl; 

or, better, avoid the std::set at all ad check using template folding, as suggested by NathanOliver (you need C++17 but, maybe loosing short circuiting, isn't really difficult adapt it in C++14).

1 Comment

yes, definitely one line is not optimal, as the set will be recreated every time.
0

I've been trying to check your solutions, using a benchmarking tool to see the outcomes. It seems that creating a set in-situ is a performance killer, that's why one should avoid such solution, which is otherwise quite fast in python (need to do performance tests in python too).

So, when using a pre-created set it's almost twice as fast as using multiple ifs. Test done for number of items = 10.

enter image description here

For items = 4, set is only 18% faster than multiple ifs. enter image description here

When creating the set in every if, the set code is almost 900% slower.

enter image description here

Link to the benchmark http://quick-bench.com/iqQJWxaRgQGh4Sry2YtfxdSqTio (but it might disappear so I also add the code tested

static void withSet(benchmark::State& state) { std::vector<std::string> l = {"1","2","9","10"}; int i = 0; std::set<std::string> s = {"1","2","3","4","5","6","7","8","9","10"}; for (auto _ : state) { const std::string t = l[i++%l.size()]; if (s.find(t) != s.end() ) { benchmark::DoNotOptimize(t); } } } // Register the function as a benchmark BENCHMARK(withSet); static void withIfs(benchmark::State& state) { std::vector<std::string> l = {"1","2","9","10"}; int i = 0; for (auto _ : state) { const std::string t = l[i++%l.size()]; if (t == "1" || t == "2" || t == "3" || t == "4" || t == "5" || t == "6" || t == "7" || t == "8" || t == "9" || t == "10" ) { benchmark::DoNotOptimize(t); } } } BENCHMARK(withIfs); 

--- edit ---

Additionaly I understood that count() in set is not equal iterating every element in set. This is the outcome of benchmark between find() != end() and count()

enter image description here

1 Comment

If you need to do it fast, consider using a pre-sorted std::vector with std::binary_search.
0

I post an alternative approach which is not a true one liner but shows good performance as tested bellow. The basic idea is same with @NathanOliver. The idea is initializing a data set {"first", ..., "fourth"} at a compile-time with static constexpr qualifier to get a run-time performance gain:

  • In C++17 and over, std::string_view is available and this has constexpr ctors. Thus it is natural to apply static constexpr std::array of std::string_view to improve the performance.

  • In addition, for such a small size data set ~10, naive linear search with cache locality of contiguous arrays sometimes beat other algorithms. So I deliberately use std::find here.

This is easily done using C++17's if statement with initializer as follows:

if(static constexpr std::array<std::string_view, 4> v = {"first", "second", "third", "fourth"}; std::find(v.cbegin(), v.cend(), s) != v.cend()) 

This can be summarized as the following compact and portable syntax sugar macro:

#include <tuple> #include <array> #include <string_view> #include <algorithm> #define IF_CONTAINS(str, ...) \ if(static constexpr \ std::array< \ std::string_view, \ std::tuple_size<decltype(std::make_tuple(__VA_ARGS__))>::value> \ v = {__VA_ARGS__}; \ std::find(v.cbegin(), v.cend(), str) != v.cend()) 

This macro IF_CONTAINS works fine as if-statement as follows:

Live DEMO

std::string s = "first"; IF_CONTAINS(s, "first", "second", "third", "fourth") { std::cout << s << " is found" << std::endl; } 

Performance Test 1

First, we test the performance of this approach with the above data set {"first", "second", "third", "fourth"}. Tests are done with Quick C++ Benchmark, gcc-8.2, C++17 and O3 optimization. The results are as follows:

  • const std::string s = "first"; (Live DEMO)

    1.1 times slower than naive implementation.

enter image description here

  • const std::string s = "second"; (Live DEMO)

    3.7 times faster than naive implementation.

enter image description here

  • const std::string s = "third"; (Live DEMO)

    1.9 times faster than naive implementation.

enter image description here

  • const std::string s = "fourth"; (Live DEMO)

    3.1 times faster than naive implementation.

enter image description here


Performance Test 2

Finally, we test the performance with a data set {"1", ...,"10"}. The test code is same with @pawel_j 's one. This test shows 2.7 times faster result than naive implementation:

Live DEMO

enter image description here

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.