6

I understand that the underlying data structure for map in C++ is a self-balancing Binary Search Tree. Since in these data structures, finding a lower bound and an upper bound to a key has lots of use, you'd think that map lower_bound, and upper_bound functions will give you that capability. It's a bummer that these functions don't deliver that. Does anyone know why lower_bound behaves the way it does? (it gives you the key that is NOT BEFORE the given key).

3
  • 1
    nusure what you expected... std::prev(m.lower_bound(key)) (ignoring bound check). Commented Apr 13, 2020 at 17:43
  • Because that is how upper and lower bound work in math? en.wikipedia.org/wiki/Upper_and_lower_bounds Commented Apr 13, 2020 at 20:58
  • I don't think the mathematical definition makes a huge relevance here. but don't want to get stuck in that discussion. My main point is that I have a lot of usage to find an element with largest key, smaller than a given key. The way it is right now, I have to find the lower_bound and back-iterate to find that. Commented Apr 15, 2020 at 23:55

3 Answers 3

13

I've been using C++ since before SGI even introduced the STL, and for some reason still manage to mess up using these methods, including even embarrassing myself when presenting them to a class. I think my problems are that:

  1. The names already have an intuitive but different meaning in mathematics. Given the mathematical meanings, it seems weird that in a big set or map, upper_bound and lower_bound are actually the same or adjacent elements.

  2. The names "upper_bound" and "lower_bound" sound like there is a some kind of symmetry between the two, when there is absolutely not. I'd have a much easier time if the names were something like least_ge (least greater than or equal to) for lower_bound and least_gt (least greater than) for upper_bound.

If someone has a mnemonic or logic to make these easy to internalize, please share it. Otherwise, it just feels like they wrote two useful functions but used two random mathematical terms to name those functions, with no way to derive the semantics from the names. At that point, why not use up made up names like egptr and pbase? I mean at least I don't have any pre-existing intuitions to overcome about the names of streambuf methods...

At any rate here are what I believe are the basic rules you have to remember:

  • lower_bound(X) returns the lowest element v such that v >= X

  • upper_bound(X) returns the lowest element v such that v > X

  • To traverse the half-open interval [L,H), start with lower_bound(L) and stop at (don't process) lower_bound(H). This is usually what you want, because it's most common to traverse half-open intervals in C++--e.g., [buf, buf+nbytes), or [0,array_size), or [begin(), end()).

  • To traverse the closed interval [L,H], start at lower_bound(L) and stop at upper_bound(H).

  • To traverse the open interval (L,H), start at upper_bound(L) and stop at lower_bound(H).

  • In a non-empty container, the mirror image of lower_bound(X) is std::prev(upper_bound(X)) and the mirror image of upper_bound(X) is std::prev(lower_bound(X)). Of course, if an element is equal to begin(), then you can't step it backwards with std::prev, so you need extra logic to deal with the fact that this point cannot be represented with an iterator value.

  • In a multiset/multimap, the first v is lower_bound(v) if that element is indeed v. The last v is std::prev(upper_bound(v)) if the container is not empty and that element is v, but remember to check that the container is not empty before attempting prev on end().

Sign up to request clarification or add additional context in comments.

4 Comments

cppreference.com's std::equal_range description got some hints: "...The first iterator may be alternatively obtained with std::lower_bound(), the second - with std::upper_bound()." Also its 'possible implementation' uses these functions such as make_pair(lower_bound(...), upper_bound(...)). @WangXiaojie's answer mentioned this as well. Also, see the answers to this question.
Seems like the root cause of the confusion is the uniformity of the interfaces. The terms "lower_bound" and "upper_bound" came from a more general concept, not only for the typical std::map. std::map contains unique keys - so there is no "range" of multiple equivalent elements and thus it's weird to say "lower"/"upper" bound. But std::map has these functions with the same names for interface uniformity and convenience.
@starriet주녕차 For me the main confusion is that c.lower_bound(x) can be greater than x. This is counter to uses of the term in other contexts (e.g., lattices, complexity theory, order theory) where elements can never be less than their lower bound. E.g., if you look up these terms by reading wikipedia on bounds en.wikipedia.org/wiki/Upper_and_lower_bounds then C++ is guaranteed to confuse you.
Since you in reasonable code have to check the bounds the std::prev() doesn't seem to provide any benefit over decrement.
3

This is not only in the map. It is in STL.

lower_bound for your x find such y that x <= y. And the upper_bound x < y.

1 Comment

Yeah, not quite sure what's the use of that. I was hoping to be able to give a key and find the largest key smaller than the given key. Map being implemented as an RB-Tree justifies this feature even more.
3

From the point view of usual math convention, the upper_bound is the "least true upper-bound" (never equal) and the lower_bound is the "least upper-bound" (could equal). The fact that lower_bound is actually an "upper-bound" in usual math convention may cause confusion among users.

A way to rationalize the name of lower_bound/upper_bound is considering them in the context of another method called equal_range. The lower_bound is really the "lower_bound" of the equal_range, similarly the upper_bound.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.