151

Now that std has a real hash map in unordered_map, why (or when) would I still want to use the good old map over unordered_map on systems where it actually exists? Are there any obvious situations that I cannot immediately see?

0

5 Answers 5

130

As already mentioned, map allows to iterate over the elements in a sorted way, but unordered_map does not. This is very important in many situations, for example displaying a collection (e.g. address book). This also manifests in other indirect ways like: (1) Start iterating from the iterator returned by find(), or (2) existence of member functions like lower_bound().

Also, I think there is some difference in the worst case search complexity.

  • For map, it is O( lg N )

  • For unordered_map, it is O( N ) [This may happen when the hash function is not good leading to too many hash collisions.]

The same is applicable for worst case deletion complexity.

Sign up to request clarification or add additional context in comments.

2 Comments

you are right about worst case, but this post is somehow misleading - as on average std::unordered_map is O(1) for search complexity which is much better then std::map
It is very important to understand that for some applications, worst case performance is crucial to know and is the deciding factor. For some hard real-time systems, having a linear worst case like the hashtable is not acceptable. std::map is always O(lg N), which is a very nice property to have.
101

In addition to the answers above you should also note that just because unordered_map is constant speed (O(1)) doesn't mean that it's faster than map (of order log(N)). The constant may be bigger than log(N) especially since N is limited by 232 (or 264).

So in addition to the other answers (map maintains order and hash functions may be difficult) it may be that map is more performant.

For example in a program I ran for a blog post I saw that for VS10 std::unordered_map was slower than std::map (although boost::unordered_map was faster than both).

Performance Graph

Note 3rd through 5th bars.

8 Comments

what is the value of N in this graph?
The blog link has gone the way of the dodo, and the results presented here are of little value without that context, as the time needed to hash vs. compare things varies hugely with the exact hash function, data type, length, and values. That's particularly important with the VC++ Standard Library, as hash functions are fast but collision prone: numbers passed through unaltered, only 10 characters spaced along a string of any length are combined in the hash value, bucket counts aren't prime. (GNU is at the opposite end of the spectrum).
VS10, that's your problem right here.
@n.m. Sadly my time machine wasn't working on Oct 11 2010.
This is a perfectly valid post. I have found this myself. It's a good example of theory vs reality. In theory a hash map should be faster, but they might actually be slower. Never take for granted what textbooks nor what the "experts" say. Especially for performance desktop CPUs caching can be a big factor in container algorithms. Theory can be way off from reality. For when performance really matters profile things yourself and experiment trying the various options. You might be surprised what you find. Plus there are down to clock cycle profiling tools for most platforms to tune things.
|
36

Partly based on Chandler Carruth's CppCon 2014 talk.

std::map is (considered by many to be) not useful for performance-oriented work:

  • If you want O(1)-amortized access, use a proper hash-map (which will probably use open addressing).
  • If you want O(1)-amortized access but can't be bothered to get a decent implementation of anything at least use std::unorderded_map rather than std::map.
  • If you want sorted sequential access into your map (which you typically don't), use something array-ish (e.g. based on an std::vector, or the result of a new[] and so on).

Also, std::map is implemented as a balanced tree; and you have to traverse it, or re-balance it, incredibly often. These are cache-killer and cache-apocalypse operations respectively... so just say NO to std::map.

You might be interested in this SO question on efficient hash map implementations.

(PS - std::unordered_map is cache-unfriendly because it uses linked lists as buckets.)

2 Comments

If you need ordered stuff (including range scans), std::map (ok, these days it is absl b-tree-based map) is the best thing you can have for performance. If not - most of the time, F14 or absl Swiss map will be the best choice.
@No-BugsHare: Remember this question is about unordered data. Also, even with respect to ordered data, and in some cases, std::map may not be very performant. Like you said, there are other ordered map implementations which seem to be faster.
22

I think it's obvious that you'd use the std::map you need to iterate across items in the map in sorted order.

You might also use it when you'd prefer to write a comparison operator (which is intuitive) instead of a hash function (which is generally very unintuitive).

Comments

15

Say you have very large keys, perhaps large strings. To create a hash value for a large string you need to go through the whole string from beginning to end. It will take at least linear time to the length of the key. However, when you only search a binary tree using the > operator of the key each string comparison can return when the first mismatch is found. This is typically very early for large strings.

This reasoning can be applied to the find function of std::unordered_map and std::map. If the nature of the key is such that it takes longer to produce a hash (in the case of std::unordered_map) than it takes to find the location of an element using binary search (in the case of std::map), it should be faster to lookup a key in the std::map. It's quite easy to think of scenarios where this would be the case, but they would be quite rare in practice i believe.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.