We have a code that was originally using a deque as a container. Unfortunately, it wasn't fast enough for our time measurements since it's taking quite some time for it to do a search or sort. So, I recently refactored the code to try and use a set instead. It's definitely faster in terms of searching and sorting than a deque.
One thing I noticed though is that the set was way slower when traversing all the elements. We have a test that basically just traverses the elements from beginning up to the end until it matches the values it is looking for and found out that the set is taking almost 5x the time it takes for the deque to do it.
Can someone provide an explanation as to why the set is slower? I assumed that the time would be around the same since it's simply a traversal of all the elements from start to end, but that's not the case. I already did a lot of searching but can't find anything about a set being slower in traversing its elements.
Update: The set/deque contains a class that basically has two integer member variables.
class Element { uint32_t id; uint32_t time; }; Compile options: -g -pipe -funit-at-a-time -O3 --param inline-unit-growth=10000 --param large-function-growth=10000 --param max-inline-insns-single=10000 -Wall -Wextra -Wno-aggregate-return -Wno-padded -Wno-reorder -Wno-sign-compare -Wno-unused-parameter -Wcast-align -Wcast-qual -Wdisabled-optimization -Wfloat-equal -Wno-inline -Wlarger-than-10000 -Wmissing-format-attribute -Wpointer-arith -Wredundant-decls -Wno-unknown-pragmas
std::set::findinstead?std::find. I am talking about usingstd::set::find.dequeuestores most of its contents sequentially in memory, making it a very cache-friendly structure (for iteration). Aset, OTOH, may well be storing each item in the set at a different/arbitrary location in memory, which would mean that there is little or no locality-of-reference during a set-traversal, and therefore your CPUs' caches are not as effective at hiding the RAM subsystem's inherent latency during a set-traversal as they were during a dequeue-traversal.set's strength is not traversing, it is searching / insertion / deletion.