12

I'm currently reading C++1y papers, and right now I'm trying to understand the n3873 paper, titled Improved insertion interface for unique-key maps. The paper states that there's a problem with the insert and emplace methods and illustrates it with the following example:

    std::map<std::string, std::unique_ptr<Foo>> m;
    m["foo"];
    std::unique_ptr<Foo> p(new Foo);
    auto res = m.emplace("foo", std::move(p));

After the code above, it states the following:

What is the value of p? It is currently unspecified whether p has been moved-from. (The answer is that it depends on the library implementation.)

Well, I'm having trouble finding the explanation for the previous quote, mainly because I can't find where the standard specifies that, in code like the above, whether p is moved or not is implementation-defined. Looking at the associative containers section (23.2.4) of the n3690 draft standard, the descriptions of emplace(args) (Inserts a value_type object t constructed with std::forward<Args>(args)...) and insert(t) only mention that the value is inserted or emplaced...

... if and only if there is no element in the container with key equivalent to the key of t.

Not a word about moving (or not) the t value. On the other hand, the memory managed by p is freed anyway (if p is moved, it is freed after the failed insertion; if it isn't moved, it is freed at the end of the scope), isn't it?


After the introduction, let me ask the following questions:

  • Why does moving a value while inserting/emplacing it into an associative container that already has the inserted key leave the value in an unspecified state?
  • Where does the standard say that this operation is implementation-defined?
  • What happens with the p of the example? Is it really freed?

Please forgive me if the question looks silly or has an obvious answer; it could be due to my limited English or because I'm not used to diving into the standard papers. Any guidance would be appreciated.

4 Answers

11

Not a word about moving (or not)

This is precisely the problem: it's left unspecified under what conditions the mapped_type will be moved.

Why does moving a value while inserting/emplacing it into an associative container that already has the inserted key leave the value in an unspecified state?

There's nothing preventing an implementation from moving the unique_ptr into a temporary variable first, and then searching for the key "foo". In this case, regardless of whether the map already contains the key or not, p == nullptr when the call to emplace returns.

Conversely, an implementation could conditionally move depending on whether the key exists or not. Then, if the key exists, p != nullptr when the function call returns. Both approaches are equally conforming. In the first case there's no way to retrieve the original contents of p even if the insertion never takes place; the object will have been destroyed by the time emplace returns.
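To see which of the two strategies a particular standard library follows, something like the sketch below can be used. Foo is a placeholder type as in the question, and Outcome and failed_emplace are hypothetical names introduced only for this illustration. The value of moved_from depends on the implementation, so no particular result should be expected for it.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

struct Foo {};  // placeholder type, as in the question

// Hypothetical result type for this illustration.
struct Outcome {
    bool inserted;    // did emplace insert a new element?
    bool moved_from;  // is p null after the call? (implementation-dependent)
};

Outcome failed_emplace() {
    std::map<std::string, std::unique_ptr<Foo>> m;
    m["foo"];  // key "foo" now exists, mapped to a null pointer
    std::unique_ptr<Foo> p(new Foo);
    auto res = m.emplace("foo", std::move(p));
    // The existing element is never modified by a failed emplace.
    assert(res.first->second == nullptr);
    return Outcome{res.second, p == nullptr};
}
```

Calling failed_emplace() and printing moved_from on different standard libraries is one way to observe the latitude described above.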

The proposed emplace_stable() and emplace_or_update() functions are intended to make the behavior predictable under all circumstances.
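For reference, C++17 later standardized this idea under different names, try_emplace and insert_or_assign. The sketch below assumes a C++17 compiler; Foo is a placeholder type and the two helper function names are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

struct Foo {};  // placeholder type

// Key already present: try_emplace must leave its arguments untouched.
bool p_survives_try_emplace() {
    std::map<std::string, std::unique_ptr<Foo>> m;
    m["foo"];
    std::unique_ptr<Foo> p(new Foo);
    auto res = m.try_emplace("foo", std::move(p));
    return !res.second && p != nullptr;
}

// Key already present: insert_or_assign overwrites the mapped value.
bool value_replaced_by_insert_or_assign() {
    std::map<std::string, std::unique_ptr<Foo>> m;
    m["foo"];
    std::unique_ptr<Foo> p(new Foo);
    auto res = m.insert_or_assign("foo", std::move(p));
    return !res.second && p == nullptr && m["foo"] != nullptr;
}
```

Unlike emplace, both outcomes here are guaranteed rather than left to the implementation.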

Where does the standard say that this operation is implementation-defined?

It's not specified as implementation-defined; it's underspecified, which allows implementations too much latitude and can result in behavior that's not always desirable.

What happens with the p of the example? Is it really freed?

In the example you've shown the contents of p will not be inserted into the map (since the key "foo" already exists). But p may or may not be moved from when the call to emplace returns.

There will never be a resource leak in any case. If the implementation unconditionally moves p it'll move it into a local copy, which will either be destroyed if the key exists, or inserted into the map if the key doesn't exist.

On the other hand, if the implementation conditionally moves p, it'll either be inserted into the map, or p will own it when emplace returns. In the latter case, it'll, of course, be destroyed when p goes out of scope.
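The no-leak claim can be checked with an instrumented type. This is only a sketch: Tracked and live_after_failed_emplace are hypothetical names, and the counter is not thread-safe.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical instrumented type that counts live instances.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    ~Tracked() { --live; }
    Tracked(const Tracked&) = delete;
    Tracked& operator=(const Tracked&) = delete;
};
int Tracked::live = 0;

// Returns the number of live Tracked objects once m and p are gone.
int live_after_failed_emplace() {
    {
        std::map<std::string, std::unique_ptr<Tracked>> m;
        m["foo"];
        std::unique_ptr<Tracked> p(new Tracked);
        m.emplace("foo", std::move(p));
        // Either p still owns the object, or a temporary already destroyed it.
        assert((p != nullptr) == (Tracked::live == 1));
    }
    return Tracked::live;
}
```

Whichever strategy the library uses, the function should return 0.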


3 Comments

I love when all my questions were answered
w/r/t What happens with the p of the example? p will not be inserted, because m["foo"]; has already created a value_type with key "foo", but you still don't know if p will be moved from or not.
@bcrist Thanks, I did inadvertently skip over that part of the question. Updated the answer.
4

Move semantics in C++ are not specific to the emplace/insert methods. The latter are just one of the cases that use move semantics to gain performance.

You should learn about rvalue references and move semantics in order to understand why p has an unspecified value after the line "m.emplace("foo", std::move(p));"

You can read about it in detail, for example, here: http://www.slideshare.net/oliora/hot-c11-1-rvalue-references-and-move-semantics

In short, the expression std::move(p) tells the compiler that you no longer care about p's contents and are totally okay with them being moved somewhere else. In practice, std::move(p) casts p to an rvalue reference type (T&&). Rvalues existed in C++ before C++11 without having an "official" type. For example, the expression (string("foo") + string("bar")) produces an rvalue, which is a string with an allocated buffer containing "foobar". Before C++11 you could not take advantage of the fact that this expression is a temporary that is about to vanish (except through compiler optimizations). Now you get this as part of the language:

v.emplace_back(string("foo") + string("bar")) 

is going to take the temporary string and move its contents directly into the container (no redundant allocations).

It works elegantly with temporary expressions, but you cannot do it directly with variables (which are lvalues, the opposite of rvalues). However, in some cases you know that you do not need a variable anymore and you want to move it somewhere else. For that you use std::move(..), which tells the compiler to treat the variable as an rvalue. You need to understand that you cannot rely on its value afterwards (until you assign it a new one). That is the contract between you and the compiler.
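A small sketch of both cases, a temporary and a named variable passed through std::move. The function name build is hypothetical. A moved-from standard library object is left in a valid but unspecified state, so the sketch never reads s between the move and the reassignment.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

std::vector<std::string> build() {
    std::vector<std::string> v;
    // Temporary expression: its buffer is moved into the container.
    v.emplace_back(std::string("foo") + std::string("bar"));

    std::string s = "baz";
    v.push_back(std::move(s));  // s is now valid but unspecified
    s = "reused";               // assigning a new value is always fine
    v.push_back(s);             // s is an lvalue here, so this copies
    return v;
}
```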


2

I think the third bullet of 17.6.4.9/1 [res.on.arguments] applies here (quoting N3936):

Each of the following applies to all arguments to functions defined in the C++ standard library, unless explicitly stated otherwise.

  • If an argument to a function has an invalid value (such as a value outside the domain of the function or a pointer invalid for its intended use), the behavior is undefined.
  • If a function argument is described as being an array, the pointer actually passed to the function shall have a value such that all address computations and accesses to objects (that would be valid if the pointer did point to the first element of such an array) are in fact valid.
  • If a function argument binds to an rvalue reference parameter, the implementation may assume that this parameter is a unique reference to this argument. [ Note: If the parameter is a generic parameter of the form T&& and an lvalue of type A is bound, the argument binds to an lvalue reference (14.8.2.1) and thus is not covered by the previous sentence. —end note ] [ Note: If a program casts an lvalue to an xvalue while passing that lvalue to a library function (e.g. by calling the function with the argument move(x)), the program is effectively asking that function to treat that lvalue as a temporary. The implementation is free to optimize away aliasing checks which might be needed if the argument was an lvalue. —end note ]

By passing an rvalue expression referring to an object to a reference parameter, you are essentially giving the standard library permission to do whatever it likes with that object. It may move from the object, or not, or modify it in any other way that is convenient for the standard library implementation.
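The same latitude is easy to reproduce with ordinary functions, since std::move is only a cast and the callee decides what happens. In this sketch, ignore, consume and demo are hypothetical names and not part of any library.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Accepts an rvalue reference but never touches the argument.
void ignore(std::unique_ptr<int>&&) {}

// Accepts an rvalue reference and actually takes ownership.
void consume(std::unique_ptr<int>&& q) {
    std::unique_ptr<int> taken = std::move(q);
}

bool demo() {
    std::unique_ptr<int> a(new int(1));
    std::unique_ptr<int> b(new int(2));
    ignore(std::move(a));   // a still owns its int
    consume(std::move(b));  // b is now null
    return a != nullptr && b == nullptr;
}
```

A library function such as emplace is free to behave like either callee unless its specification says otherwise.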


2

Contrary to what the linked article says, I would say the language of the standard almost guarantees that this code does the wrong thing: it moves the pointer from p, and then destroys the object originally pointed to by p because in the end nothing gets inserted into the map m (since the key constructed from "foo" is already present). [I say "almost" only because the language of the Standard is less clear than one should wish; obviously the question at hand simply wasn't on the mind of whoever wrote this.]

Citing from table 102 in 23.2.4, entry a_uniq.emplace(args), the effect is

Inserts a value_type object t constructed with std::forward<Args>(args)...if and only if there is no element in the container with key equivalent to the key of t.

Here value_type for the case of a std::map is std::pair<const Key, T>, in the example with Key equal to std::string and T equal to std::unique_ptr<Foo>. So the object t referred to is (or would be) constructed as

std::pair<const std::string, std::unique_ptr<Foo>> t("foo", std::move(p)); 

and the "key of t" is the first component of that pair. As the linked article indicates, the language is imprecise due to the conflation of “construct” and “insert”: one might construe that "if and only if" refers to both of them, and that therefore t is neither constructed nor inserted in case there is an element in the container with key equivalent to the key of t; then in this scenario nothing would be moved from p (because of the lack of construction) and p would not become null. However, there is a logical inconsistency in this reading of the cited phrase: if t should never be constructed, what on earth could the "key of t" refer to? Therefore I think the only reasonable reading of this text is: the object t is (unconditionally) constructed as indicated, and then t is inserted into the container if and only if there is no element in the container with key equivalent to the key of t. In the case t is not inserted (as in the example), the temporary will disappear on returning from the call to emplace, destroying the resource moved into it as it goes.

Of course this does not mean it is impossible for an implementation to do the right thing: separately construct the first (key) component of t, look up that key in the container, and only if it is not found construct the complete pair t (at this time moving the mapped-to object from p to the second component of t) and insert that. (This does require that the key type is copy or move constructible, since what will become the first component of t is initially constructed in a different place.) It is exactly because such an implementation is possible that the article proposes to provide a means to reliably ask for such behaviour. But the current language of the standard does not seem to give licence to such an implementation, and even less an obligation to behave like that.
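The "look up the key first" strategy can be approximated from user code. The helper below is a sketch under assumptions: emplace_if_absent is a hypothetical name, and it only handles ordered unique-key maps such as std::map.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical helper: only moves from `value` when an insertion
// will actually take place.
template <class Map, class Value>
std::pair<typename Map::iterator, bool>
emplace_if_absent(Map& m, const typename Map::key_type& key, Value&& value) {
    auto it = m.lower_bound(key);  // first element whose key is not less than key
    if (it != m.end() && !m.key_comp()(key, it->first))
        return {it, false};        // equivalent key found: value is untouched
    return {m.emplace_hint(it, key, std::forward<Value>(value)), true};
}
```

Because the lookup happens before any pair is constructed, the caller's object is left alone when the key already exists.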


Let me add that I ran into this problem in practice, because I naively thought that having a nice new method emplace it would certainly be defined to work well with move semantics. So I wrote something along the lines of:

    auto p = m.emplace(key, std::move(mapped_to_value));
    if (not p.second) // no insertion took place
    {
        /* some action with value p.first->second about to be overwritten here */
        p.first->second = std::move(mapped_to_value); // replace mapped-to value
    }

It turned out to be not so, and in my "mapped to" type, which happened to contain both a shared pointer and a unique pointer, the shared pointer component behaved fine, but the unique pointer component would become null in case a previous entry in the map was overwritten. Given that this idiom does not work, I rewrote it to

    auto range = m.equal_range(key);
    if (range.first == range.second) // the key was previously absent; insert a pair
        m.emplace_hint(range.first, key, std::move(mapped_to_value));
    else // the key was present, replace the associated value
    {
        /* some action with value range.first->second about to be overwritten here */
        range.first->second = std::move(mapped_to_value); // replace mapped-to value
    }

This is a reasonable work-around that works without many assumptions about the mapped-to type (notably it need not be default-constructible or copy-constructible, just move-constructible and move-assignable).

It looks like this idiom should even work for unordered_map, though I did not try it for that case. In fact, looking closer, it does work, but the use of emplace_hint is pointless: unlike std::map, std::unordered_map::equal_range is obliged, in the case of an absent key, to return a pair of iterators both equal to the (uninformative) value returned by std::unordered_map::end, rather than some other pair of equal iterators. Indeed it seems that std::unordered_map::emplace_hint, which is allowed to ignore the hint, is almost forced to do so: either the key is already present and emplace_hint should do nothing (except gobble up the resources possibly moved into its temporary pair t), or else (no such key is present) there is no way to obtain a useful hint, since neither m.find nor m.equal_range is allowed to return anything other than m.end() when invoked with a key that turns out to be absent.
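The difference between the two equal_range results for an absent key can be seen in a short sketch. The function names are hypothetical, and the keys "a", "b" and "c" are arbitrary.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <unordered_map>

// std::map: the empty range for an absent key marks the insertion position.
bool map_hint_is_informative() {
    std::map<std::string, int> m{{"a", 1}, {"c", 3}};
    auto r = m.equal_range("b");
    return r.first == r.second && r.first != m.end() && r.first->first == "c";
}

// std::unordered_map: the empty range for an absent key is always (end, end).
bool unordered_hint_is_end() {
    std::unordered_map<std::string, int> m{{"a", 1}, {"c", 3}};
    auto r = m.equal_range("b");
    return r.first == m.end() && r.second == m.end();
}
```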

