Dictionary vs List for looking up an item

Question

I was under the impression that it was quicker to look up items in a Dictionary than a List, the follow code seems to suggest otherwise:

Dictionary : 66 ticks

List: 32 ticks

Im assuming ive screwed up somewhere?

static void Main(string[] args) { // Speed test. Dictionary<string, int> d = new Dictionary<string, int>() { {"P1I1-1MS P2I1-1MS 3I-1MS 4I-1MS", 2}, {"P1I2-1MS P2I1-1MS 3I-1MS 4I-1MS", 1}, {"P1I3-1MS P2I1-1MS 3I-1MS 4I-1MS", 0}, {"P1I4-1MS P2I1-1MS 3I-1MS 4I-1MS", -1}, {"P1I5-1MS P2I1-1MS 3I-1MS 4I-1MS", 0}, {"P1I1-1MS P2I2-1MS 3I-1MS 4I-1MS", 0}, {"P1I2-1MS P2I2-1MS 3I-1MS 4I-1MS", 0}, {"P1I3-1MS P2I2-1MS 3I-1MS 4I-1MS", 0}, {"P1I4-1MS P2I2-1MS 3I-1MS 4I-1MS", 0}, {"P1I5-1MS P2I2-1MS 3I-1MS 4I-1MS", 0}, {"P1I1-1MS P2I3-1MS 3I-1MS 4I-1MS", 0}, {"P1I2-1MS P2I3-1MS 3I-1MS 4I-1MS", 0}, {"P1I3-1MS P2I3-1MS 3I-1MS 4I-1MS", 0}, {"P1I4-1MS P2I3-1MS 3I-1MS 4I-1MS", 0}, {"P1I5-1MS P2I3-1MS 3I-1MS 4I-1MS", 0}, {"P1I1-1MS P2I4-1MS 3I-1MS 4I-1MS", 2} }; List<string> l = new List<string>(); l.Add("P1I1-1MS P2I1-1MS 3I-1MS 4I-1MS"); l.Add("P1I2-1MS P2I1-1MS 3I-1MS 4I-1MS"); l.Add("P1I3-1MS P2I1-1MS 3I-1MS 4I-1MS"); l.Add("P1I4-1MS P2I1-1MS 3I-1MS 4I-1MS"); l.Add("P1I5-1MS P2I1-1MS 3I-1MS 4I-1MS"); l.Add("P1I1-1MS P2I2-1MS 3I-1MS 4I-1MS"); l.Add("P1I2-1MS P2I2-1MS 3I-1MS 4I-1MS"); l.Add("P1I3-1MS P2I2-1MS 3I-1MS 4I-1MS"); l.Add("P1I4-1MS P2I2-1MS 3I-1MS 4I-1MS"); l.Add("P1I5-1MS P2I2-1MS 3I-1MS 4I-1MS"); l.Add("P1I1-1MS P2I3-1MS 3I-1MS 4I-1MS"); l.Add("P1I2-1MS P2I3-1MS 3I-1MS 4I-1MS"); l.Add("P1I3-1MS P2I3-1MS 3I-1MS 4I-1MS"); l.Add("P1I4-1MS P2I3-1MS 3I-1MS 4I-1MS"); l.Add("P1I5-1MS P2I3-1MS 3I-1MS 4I-1MS"); l.Add("P1I1-1MS P2I4-1MS 3I-1MS 4I-1MS"); Stopwatch sw = new Stopwatch(); string temp = "P1I1-1MS P2I4-1MS 3I-1MS 4I-1MS"; bool inDictionary = false; sw.Start(); if (d.ContainsKey(temp)) { sw.Stop(); inDictionary = true; } else sw.Reset(); Console.WriteLine(sw.ElapsedTicks.ToString()); Console.WriteLine(inDictionary.ToString()); bool inList = false; sw.Reset(); sw.Start(); if (l.Contains(temp)) { sw.Stop(); inList = true; } else sw.Reset(); Console.WriteLine(sw.ElapsedTicks.ToString()); Console.WriteLine(inList.ToString()); Console.ReadLine(); }

EDIT

Modification to Matthew Watson's code.

Are you running this test in the debugger or the executable? What about the mode, debug/release? — DGibbs
– DGibbs, Commented Aug 15, 2013 at 8:09
Yeah its being run in debugger (debug). .net 4 client profile — Hans Rudel
– Hans Rudel, Commented Aug 15, 2013 at 8:10
One execution and case is way to little to provide valid data. You should run the search 10k times and have it search for different matches. Your matches are all the same length and contain much the same data, this may reduce some of the dictionary benefits. — MrFox
– MrFox, Commented Aug 15, 2013 at 8:12
@HansRudel I've just run the test in release and not in the debugger over 1000000 iterations and dictionary is definitely faster: Dict: 97628.10ns vs List: 10396.84ns though the difference is not much. I suspect this is due to the relatively small sizes of each collection. — DGibbs
– DGibbs, Commented Aug 15, 2013 at 8:25

Matthew Watson · Accepted Answer · 2013-08-15 08:22:39Z

Here's a proper test:

Dictionary<string, int> d = new Dictionary<string, int>() { {"P1I1-1MS P2I1-1MS 3I-1MS 4I-1MS", 2}, {"P1I2-1MS P2I1-1MS 3I-1MS 4I-1MS", 1}, {"P1I3-1MS P2I1-1MS 3I-1MS 4I-1MS", 0}, {"P1I4-1MS P2I1-1MS 3I-1MS 4I-1MS", -1}, {"P1I5-1MS P2I1-1MS 3I-1MS 4I-1MS", 0}, {"P1I1-1MS P2I2-1MS 3I-1MS 4I-1MS", 0}, {"P1I2-1MS P2I2-1MS 3I-1MS 4I-1MS", 0}, {"P1I3-1MS P2I2-1MS 3I-1MS 4I-1MS", 0}, {"P1I4-1MS P2I2-1MS 3I-1MS 4I-1MS", 0}, {"P1I5-1MS P2I2-1MS 3I-1MS 4I-1MS", 0}, {"P1I1-1MS P2I3-1MS 3I-1MS 4I-1MS", 0}, {"P1I2-1MS P2I3-1MS 3I-1MS 4I-1MS", 0}, {"P1I3-1MS P2I3-1MS 3I-1MS 4I-1MS", 0}, {"P1I4-1MS P2I3-1MS 3I-1MS 4I-1MS", 0}, {"P1I5-1MS P2I3-1MS 3I-1MS 4I-1MS", 0}, {"P1I1-1MS P2I4-1MS 3I-1MS 4I-1MS", 2} }; List<string> l = new List<string> { "P1I1-1MS P2I1-1MS 3I-1MS 4I-1MS", "P1I2-1MS P2I1-1MS 3I-1MS 4I-1MS", "P1I3-1MS P2I1-1MS 3I-1MS 4I-1MS", "P1I4-1MS P2I1-1MS 3I-1MS 4I-1MS", "P1I5-1MS P2I1-1MS 3I-1MS 4I-1MS", "P1I1-1MS P2I2-1MS 3I-1MS 4I-1MS", "P1I2-1MS P2I2-1MS 3I-1MS 4I-1MS", "P1I3-1MS P2I2-1MS 3I-1MS 4I-1MS", "P1I4-1MS P2I2-1MS 3I-1MS 4I-1MS", "P1I5-1MS P2I2-1MS 3I-1MS 4I-1MS", "P1I1-1MS P2I3-1MS 3I-1MS 4I-1MS", "P1I2-1MS P2I3-1MS 3I-1MS 4I-1MS", "P1I3-1MS P2I3-1MS 3I-1MS 4I-1MS", "P1I4-1MS P2I3-1MS 3I-1MS 4I-1MS", "P1I5-1MS P2I3-1MS 3I-1MS 4I-1MS", "P1I1-1MS P2I4-1MS 3I-1MS 4I-1MS" }; int trials = 4; int iters = 1000000; Stopwatch sw = new Stopwatch(); string target = "P1I1-1MS P2I4-1MS 3I-1MS 4I-1MS"; for (int trial = 0; trial < trials; ++trial) { sw.Restart(); for (int i = 0; i < iters; ++i) d.ContainsKey(target); sw.Stop(); Console.WriteLine("Dictionary took " + sw.Elapsed); sw.Restart(); for (int i = 0; i < iters; ++i) l.Contains(target); sw.Stop(); Console.WriteLine("List took " + sw.Elapsed); }

Run a RELEASE build of this OUTSIDE any debugger.

My results:

Dictionary took 00:00:00.0587588 List took 00:00:00.2018361 Dictionary took 00:00:00.0578586 List took 00:00:00.2003057 Dictionary took 00:00:00.0611053 List took 00:00:00.2033325 Dictionary took 00:00:00.0583175 List took 00:00:00.2056591

Dictionary is clearly faster, even with so few entries.

With more entries, Dictionary will be even faster compared to list.

The lookup time using a Dictionary is O(1), whereas a List takes O(N). That will of course make a HUGE difference for large values of N.

Hi, thanks for the code. Ive run it and get similar times to you. Ive just modified it so that in each For-Loop, it generates chooses a new target via target = l[random.Next(l.Count)]; but the times are now ~ the same. Any idea why? See edit to original question for picture.
@HansRudel Well, firstly you neglected to add braces {} so the loops are both just accessing the list (and doing nothing with the dictionary inside the loop). Secondly, if you do add the random lookup inside the loop, you will also have added to the loop the amount of time it takes to index a List<>, which will skew the results. Furthermore, as soon as you add a random component to a list you destroy any consistency - the targets randomly chosen for one of the loops may be much "worse" than the ones randomly chosen for the other loop.
Updated the times. Regarding your 2nd point, id have thought that since there is a sufficiently large number of iterations, this problem would be negligible. Anyway, thanks for your help, i appreciate it.

Dmitrii Bychenko · Accepted Answer · 2013-08-15 08:17:44Z

Dictionary is faster than List in assimptotic meaning of the word: O(1) < O(n); it means that there's such a size X from which on Dictionary starts being faster than List. E.g. if Dictionary/List contains, say, 10000 or more items, Dictionary is always faster (if there're fewer items to store, say, 5, List can well be faster). There's another issue with Dictionary: it requires GetHashCode() method; if GetHashCode() is bad implemented Dictionary can be deadly slow.

Am i correct in assuming then that "P1I4-1MS P2I2-1MS 3I-1MS 4I-1MS" is a poor key? Each of the objects im dealing with contains x inputs and the list of these inputs = 1 permutation => being unique. So i thought that might be a good way to store which object i had already delt with as i had read about hash conflicts.
No, default GetHashCode() implementations (in your case - String.GetHashCode()) are quite good; it seems that you test Dictionary/List with too few items. Try, say, add 100000 different keys/items into List and Dictionary...

Community · Accepted Answer · 2017-05-23 12:19:42Z

This is due to the law of large numbers. In short

According to the law, the average of the results obtained from a large number of trials should be close to the expected value, and will tend to become closer as more trials are performed.

Another constrain is the Big-O notation , which is in fact useless on a small scale. For example you can say that O(1) ~ O(N) ~ O(n!) for a given n that is less than some small number.

Running a good experiment requires meeting some pretty strict conditions like:

Make sure algorithms are comparable
Have large number of iterations in the algorithm
Run on the same hardware
Run in release mode with max optimization
No Debugger or Performance analysis tools should be attached
Conduct many experiments and calculate average + standard deviation
...

All very valid points. I will keep a list of them for next time. Thanks

Collectives™ on Stack Overflow

Dictionary vs List for looking up an item

3 Answers 3

4 Comments

2 Comments

1 Comment

Linked

Hot Network Questions