C++0x issue: Constant time insertion into std::set

Question

According to this page, I can achieve constant time insertion if I use

iterator std::set::insert ( iterator position, const value_type& x );

and the position iterator I provide directly "precedes" the proper (in-order) insertion point.

Now the case I'm concerned with is if I know that the value I'm inserting goes at the end (since it's the largest), e.g.:

set<int> foo = {1, 2, 3}; foo.insert(4); // this is an inefficient insert

According to the above criterion I should pass the last element foo.end()-1 to insert not foo.end(). Is my understanding correct? What happens if I pass foo.end()? Will it be a O(log n) insertion or a O(1) one. So, the options are:

// Option A foo.insert(foo.end()-1, 4); // Option B foo.insert(foo.end(), 4); // Safer version of Option A if(foo.empty()) foo.insert(4); else foo.insert(foo.end()-1, 4);

I ask because I'm writing a function that's templated on the container. I want to insert an element (that I know is the largest) to the end of whatever container is passed in. Using "Option A" above has a different behavior for a container like vector:

foo.insert(foo.end()-1, 4); // result is {1, 2, 3, 4} if foo is an std::set // result is {1, 2, 4, 3} if foo is an std::vector

As @Bo_Persson suggests, the problem here is that C++03 says "logarithmic in general, but amortized constant if t is inserted right after p." while C++0x says "logarithmic in general, but amortized constant if t is inserted right before p."

PS: I'm using GCC 4.5 on Ubuntu 11.04 with C++0x support enabled.

Edit: I ran empirical tests with C++0x support enabled and disabled and posted the results in an answer. Basically, the conclusion is that it's just as good (and is obviously safer) to provide end() as the insertion hint. However, that's obviously just an empirical observation. The standard, as stated, is still confusing on this aspect.

antonakos · Accepted Answer · 2011-07-03 22:42:39Z

Is it cheating to run a test instead of reading through library specifications?

For g++-4.4 -O2 for the integers 0 <= i < 5000000 my running times for standard insertion are

real 0m14.952s user 0m14.665s sys 0m0.268s

and my running times for insertion using end() as hint are

real 0m4.373s user 0m4.148s sys 0m0.224s

Insertion at end() - 1 is just as fast as far as I can tell, but it is more cumbersome to use because end() - 1 is an illegal operation (you have to use operator--()) and it crashes if the set happens to be empty.

#include <set> typedef std::set<int> Set; void insert_standard(Set& xs, int x) { xs.insert(x); } void insert_hint_end(Set& xs, int x) { xs.insert(xs.end(), x); } int main() { const int cnt = 5000000; Set xs; for (int i = 0; i < cnt; i++) { // insert_hint_end(xs, i); insert_standard(xs, i); } }

Great test. I'm assuming that's with C++0x support not enabled.

Bo Persson · Accepted Answer · 2011-07-03 12:05:32Z

3

It is not totally clear if the position should be pointing before or after the insertion point. Some implementations work with either.

On the other hand, if you want different behavior for different containers, why don't you just write two overloads for your function, one for containers with a push_back function and one for std::set.

answered Jul 3, 2011 at 12:05

Bo Persson

92.9k31 gold badges153 silver badges211 bronze badges

9 Comments

Kerrek SB Over a year ago

It'd be nice to check a couple of current implementations to see how this is handles. I believe that GCC had a change where they went from "checking a couple of neighbouring values" to checking only the hint and then aborting. But since end() can't be evaluated, I would hazard a guess that it can be used to successfully hint insertion of an element that's larger than all existing ones; it'd be a sensible and easy thing to implement.

Bo Persson Over a year ago

A problem here is that C++03 says "logarithmic in general, but amortized constant if t is inserted right after p." while C++0x says "logarithmic in general, but amortized constant if t is inserted right before p." What should a poor compiler do? :-)

Matthieu M. Over a year ago

@Bo Persson: I would rather say, what should a poor library writer do ? And it seems "obvious" that he should follow the standard for which he codes, ie use C++0x standard for the C++0x flavor of the library (the one with emplace* methods and all). I think this change originated in order to accomodate the end() case more easily.

Bo Persson Over a year ago

@Matthieu - But that means breaking old code, which is not nice. I think the change might be caused by a new C++0x requirement for multiset/map to keep equivalent nodes in insert order.

Ken Bloom Over a year ago

The C++0x version fixes set<int> foo = {1, 2, 3}; foo.insert(some_iterator, 0) by making it actually be possible to use foo.begin() as some_iterator?

|

Ben Voigt · Accepted Answer · 2011-07-04 01:03:57Z

Only supplying an iterator that falls immediately after the new value makes any sense.

That's because in a collection of N elements, there are N+1 possible insertion points. An iterator exists that comes after a value higher than any preexisting element, but there is no iterator that comes before a value before all elements.

I haven't thought of that. That's a great way of putting it. The standard for std::set should reflect this fact.

Alan Turing · Accepted Answer · 2011-07-03 23:46:55Z

Following in the footsteps of @antonakos, I'm expanding on the "cheating" solution and running an empirical test. I'm using GCC 4.5 with optimization (-02) and considering both the case when C++0x support is not enabled and when it is with -std=c++0x. Results on 40,000,000 insertions are as follows (showing system time as the other values in this case are not special):

Without C++0x support
- No hint: 26.6 seconds
- Hint at end(): 5.71 seconds
- Hint at --end(): 5.84 seconds
With C++0x support enabled
- No hint: 29.2 seconds
- Hint at end(): 5.34 seconds
- Hint at --end(): 5.54 seconds

Conclusion: GCC (with or without C++0x enabled) inserts efficiently when end() is provided as the insertion hint.

The code I used is based on @antonakos's:

#include <set> typedef std::set<int> Set; void insert_standard(Set & xs, int x) { xs.insert(x); } void insert_hint_end(Set & xs, int x) { xs.insert(xs.end(), x); } void insert_hint_one_before_end(Set & xs, int x) { xs.insert(--xs.end(), x); } int main() { const int cnt = 40000000; Set xs; xs.insert(0); for (int i = 1; i < cnt; i++) { //insert_standard(xs, i); //insert_hint_one_before_end(xs, i); insert_hint_end(xs, i); } return 0; }

Collectives™ on Stack Overflow

C++0x issue: Constant time insertion into std::set

4 Answers 4

1 Comment

9 Comments

1 Comment

Comments

Linked

Hot Network Questions

Collectives™ on Stack Overflow

4 Answers 4

1 Comment

9 Comments

1 Comment

Comments

Linked

Related